
Lack of sales/interest due to slow hardware?

Posted Aug 19, 2011 17:18 UTC (Fri) by cmccabe (guest, #60281)
In reply to: Lack of sales/interest due to slow hardware? by pr1268
Parent article: HP dropping webOS devices

WebOS ran slowly and was power-hungry because it was based around Javascript, a language that was never designed for high efficiency.

I guess the theory was that being based around HTML/CSS/Javascript would make it easier for developers to hop on board. But the reality is, WebOS got a late start and didn't really offer anything to developers that other platforms didn't already.

The business model was also pretty questionable. HP is a company built around selling lots and lots of hardware. Did they really expect other OEMs to enthusiastically jump on board their proprietary OS, knowing they would be competing directly with HP?

The only surprise here is that it took HP so long to reach the obvious conclusion and focus their resources elsewhere.



Lack of sales/interest due to slow hardware?

Posted Aug 19, 2011 18:35 UTC (Fri) by tstover (guest, #56283) [Link] (57 responses)

Yes, the HTML/CSS/Javascript "mobile application" notion was not good. However, the idea of supporting things like C & SDL beats gross Java any day.

Lack of sales/interest due to slow hardware?

Posted Aug 19, 2011 19:29 UTC (Fri) by cmccabe (guest, #60281) [Link] (56 responses)

You can use C on Android with the NDK.

Most mobile applications are all about the graphical user interface, though. Writing graphical user interfaces in C is not really a good choice.

I never understood the hate for Java. The mobile platforms insulate you from the most irritating parts, like JVM startup time, classpath, bundled libraries, JVM differences, and old-fashioned UI toolkits.

Java is the new Tk

Posted Aug 20, 2011 13:37 UTC (Sat) by khim (subscriber, #9252) [Link] (55 responses)

It's very easy to create a "passable UI" with Java. It's hard to create a "great UI" with Java.

GC is the primary reason. It actively encourages designs which lead to slow, sloppy UIs - and you cannot fix that without basically a full redesign of everything. The only solution is to paper over the problem with hugely excessive horsepower (basically, if you throw 10x the CPU power and 10x the memory at Java compared to sane languages, you'll get the same experience).

Note that Apple supports GC on Macs but does not support it on iOS: it knows the iPhone/iPad are not yet powerful enough for that.

Java is the new Tk

Posted Aug 20, 2011 15:42 UTC (Sat) by cmccabe (guest, #60281) [Link] (54 responses)

Java on the desktop got a bad reputation for a bunch of reasons. It took a lot of time for the JVM to start up. Java desktop UI toolkits always looked amateurish and out of place on Windows and Mac. For whatever reason, Sun never came out with an incremental garbage collector for desktop use, so GC pauses were a real hassle.

All of these problems have been fixed. The new version of Android has incremental garbage collection. This helps a lot with the responsiveness of the UI. JVM startup time is not an issue because the JVM is always started. The UI toolkit is fully native rather than just being an ugly shim on top of something else.

In fact, nearly all UI development today is done in garbage-collected languages. On the desktop, you have .NET, Javascript, and HTML; on the server-side, you have more .NET, Java, Ruby, and Python. Java and .NET are actually the fastest of the bunch.

There are still some people developing UIs in C++ or C on the desktop. The main reason you would do this is because the rest of the app needs to be in C++ or C for speed reasons.

Typical Android handsets tend to clock in at somewhere between 800 MHz and 1.2 GHz today. Apple's flagship device is still at 800 MHz (I think?). Experience suggests that you don't need "10x horsepower" to use GC.

iPhones still do tend to get better battery life than Android handsets. I expect this difference to narrow as users start expecting their phones to do more. For example, up until recently the iPhone had no way to run user-installed background processes. But due to user demand, and the fact that Android had one, they had to add it in.

Java is the new Tk

Posted Aug 21, 2011 6:58 UTC (Sun) by tzafrir (subscriber, #11501) [Link] (5 responses)

Most of the other "languages" (or rather: their implementations) that you mention as using garbage collection use a reference counting one, IIRC:

Javascript, Ruby (most implementations - except, maybe, JRuby and IronRuby), Python (except JPython?), etc. C++ likewise has its own reference counting partial garbage collection facility.

With reference counting there's no inherent need for a single pass (or an elaborate way to work around the need for one).

Refcounting is fine...

Posted Aug 21, 2011 8:39 UTC (Sun) by khim (subscriber, #9252) [Link] (1 responses)

With refcounting you still must think about object lifetimes. Thus you will still create sane designs, because you must make sure you don't create loops. But full GC encourages "don't know and don't care when this object will be removed" designs - and those can only be fixed with a full rewrite.

Refcounting is fine...

Posted Aug 31, 2011 14:25 UTC (Wed) by nix (subscriber, #2304) [Link]

Er, recursive reference loops are common in all sorts of designs. In pure refcounted ones, you have to use crocks like having a single object registry and having everything else have references to it, even if that is otherwise unnatural. A good few refcounted implementations (including Python and IIRC Ruby too) collect cyclic garbage just fine, by having a conventional GC as well (in Python's case, a generational GC with a variety of optimizations to take into account that most garbage is already collected by the refcounter, so most of what it will see is uncollectable).
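
The same trap is easy to reproduce with the refcounted C++ smart pointers that come up later in this thread; a minimal sketch (Parent/Child are made-up types, and the commented-out weak_ptr line is the usual way to break the loop):

// Minimal illustration of the cycle problem with plain reference counting,
// here using GCC's tr1::shared_ptr (which has no cycle collector).
#include <tr1/memory>
#include <cstdio>

struct Child;

struct Parent {
    std::tr1::shared_ptr<Child> child;
    ~Parent() { std::puts("Parent destroyed"); }
};

struct Child {
    std::tr1::shared_ptr<Parent> parent;   // strong back-reference: completes a cycle
    // std::tr1::weak_ptr<Parent> parent;  // the usual fix: a weak back-reference
    ~Child() { std::puts("Child destroyed"); }
};

int main() {
    {
        std::tr1::shared_ptr<Parent> p(new Parent());
        p->child.reset(new Child());
        p->child->parent = p;              // Parent's count: 2, Child's count: 1
    }   // the stack reference is gone, but the cycle keeps each count at 1,
        // so neither destructor ever runs
    std::puts("scope left; with the cycle in place nothing was destroyed");
}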

Java is the new Tk

Posted Aug 27, 2011 18:30 UTC (Sat) by BenHutchings (subscriber, #37955) [Link] (2 responses)

Reference counting can't deal with cycles. Some languages just bail on this (e.g. Perl) but most use a real GC. Even though CPython does reference counting it also has a real GC to break cycles. In a web browser, Javascript code is not trusted, so any implementation must implement real GC.

Reference counting can also result in long pauses due to cascading destruction. In fact, it can be worse than incremental GC in this respect.

Java is the new Tk

Posted Aug 27, 2011 23:47 UTC (Sat) by andresfreund (subscriber, #69562) [Link] (1 responses)

It's not like it's impossible to delay destruction in a refcounted scheme. Doing so should be rather simple.
I guess the point is that predictable deletion has nice properties for implementing things in an RAII-ish fashion, which probably outweigh the problems of expensive cascading deletion.
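
For what "RAII-ish" buys you, a minimal C++ sketch (the FileHandle type and the path are invented for illustration): the cleanup is tied to a known point, the end of the scope, rather than to whenever a collector gets around to it.

#include <cstdio>

// A file handle whose cleanup runs at a known point: when the object
// leaves scope.
struct FileHandle {
    std::FILE* f;
    explicit FileHandle(const char* path) : f(std::fopen(path, "r")) {}
    ~FileHandle() { if (f) std::fclose(f); }
};

void read_config() {
    FileHandle cfg("/tmp/example.conf");
    if (cfg.f) {
        // ... read from cfg.f ...
    }
}   // cfg is destroyed right here, so the file is closed right here -
    // not "whenever the collector next runs"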

Java is the new Tk

Posted Aug 28, 2011 23:03 UTC (Sun) by foom (subscriber, #14868) [Link]

The problem is that it's not actually predictable. It only *seems* predictable. But you could create a reference cycle pointing to your object, and all of a sudden, whoops, doesn't get destructed when you thought it was going to anymore.

There's a reason that Python introduced the "with" statement: to allow for easily-written *actually* predictable resource closing.

Well, it's nice to ignore facts...

Posted Aug 21, 2011 8:35 UTC (Sun) by khim (subscriber, #9252) [Link] (47 responses)

> In fact, nearly all UI development today is done in garbage-collected languages.

...with just a single, yet significant, exception: Apple. It actively fights the GC disease on iOS. There are other, smaller bastions: Microsoft Office, for example. And every time you see a "latest and greatest" Java or .NET-based UI you see "choppy and laggy"; when you see something developed using the "old-school non-GC-based approach" you see "slick and beautiful". Coincidence? I think not.

> Typical Android handsets tend to clock in at somewhere between 800 MHz and 1.2 GHz today. Apple's flagship device is still at 800 MHz (I think?). Experience suggests that you don't need "10x horsepower" to use GC.

Note that the typical Android handset already has a dual-core CPU while Apple is using a slower single-core CPU. And still the experience of iOS is "smooth butter" while Android's experience is "choppy and laggy". I guess when Android handsets get four-core 2GHz CPUs they will finally be able to reach the level of responsiveness of Apple's single-core 800MHz CPU. This is exactly 10x the horsepower :-)

> There are still some people developing UIs in C++ or C on the desktop. The main reason you would do this is because the rest of the app needs to be in C++ or C for speed reasons.

No. The main reason is that you want something good-looking on lower-end systems. Note how Visual Studio (which was initially in C/C++) is slowly morphing into a .NET-based monster while the Microsoft Office team fights .NET tooth and nail. That's because MS Office should work great on low-end systems, not just on 16-core/16GiB monsters.

Well, it's nice to ignore facts...

Posted Aug 21, 2011 12:16 UTC (Sun) by alankila (guest, #47141) [Link] (46 responses)

I've been contracted to make an Android app that is largely based on OpenGL. It displays textures of variable sizes, from 1024x1024 to 2048x2048, all at 16 bits per pixel, which it decodes from JPEG asynchronously. These textures consume from 2 to 8 MB each, and between downloading them from the web, decoding the JPEG to an RGB bitmap, and uploading these bitmaps to the GPU, there is a fair bit of concurrent memory and CPU pressure while the app itself presents a smoothly running OpenGL animation.

I'm testing the application on a Galaxy S. No choppiness is evident. According to logcat, GC pause lengths vary from < 10 ms to 37 ms, the majority of the pauses being 20 ms. I don't notice them at all. There is unfortunately one thing that makes an app of this type a bit choppy: sometimes a new texture must be uploaded to the GPU, and it must be done from the rendering thread, because it has the only OpenGL context! And for the 2048x2048 (8 MB) texture, this operation takes something like 60 ms, and you can clearly see how the app misses a beat there.

I've also written the iOS version of the app. It behaves virtually the same, as the architecture is of course the same. JPEG decoding is done in an NSOperationQueue which is backed by a private thread, and the texture upload is inline, as this is how it has to be. On iOS there's also a slight pause in the animation during the texture upload.

However, iOS version was much harder to write because objC was new to me, and xcode 4 is fairly buggy and crashes quite often, and then there's all that additional complexity around object allocation initialization, and autorelease pools and references that you need to explicitly care for. The class files are longer than their java counterpart, and the need to free every little object explicitly adds a chance for making mistakes. I really don't believe in requiring programmers to forcibly care about this sort of stuff when most of these allocations that add to the programming-type complexity are so small that it doesn't even matter if they hang around for a while and are taken care of by GC at some convenient time later.

To summarize: no, I don't think GC is an issue anymore. Some API issues like the possibility for only synchronous texture uploads are far more important.
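
The decode-anywhere / upload-only-on-the-GL-thread constraint described above usually ends up looking something like the following sketch; the type and function names are invented for illustration (and RGB565 is assumed to match the 16-bit textures mentioned), not taken from the actual app:

#include <GLES2/gl2.h>
#include <stdint.h>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

struct PendingTexture {
    int width, height;
    std::vector<uint16_t> pixels;   // RGB565, already decoded from JPEG
};

std::mutex g_lock;
std::queue<PendingTexture> g_ready; // filled by the JPEG-decoding worker thread

// Worker thread: called once a JPEG has been downloaded and decoded.
void enqueue_texture(PendingTexture t) {
    std::lock_guard<std::mutex> guard(g_lock);
    g_ready.push(std::move(t));
}

// Rendering thread: called once per frame. This thread owns the only GL
// context, so the upload has to happen here - and for a 2048x2048 texture
// the glTexImage2D call alone can cost tens of milliseconds, which is the
// missed beat described above.
void upload_pending_textures() {
    std::queue<PendingTexture> batch;
    {
        std::lock_guard<std::mutex> guard(g_lock);
        std::swap(batch, g_ready);
    }
    while (!batch.empty()) {
        PendingTexture& t = batch.front();
        GLuint tex = 0;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, t.width, t.height, 0,
                     GL_RGB, GL_UNSIGNED_SHORT_5_6_5, &t.pixels[0]);
        batch.pop();
    }
}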

Well, it's nice to ignore facts...

Posted Aug 21, 2011 15:26 UTC (Sun) by endecotp (guest, #36428) [Link] (12 responses)

> sometimes a new texture must be uploaded to the GPU, and it must
> be done from the rendering thread, because [on Android] it has the
> only OpenGL context!

My understanding is that in principle you can have multiple OpenGL contexts on Android, but in practice (according to gurus on the Android forums) this is likely to be unreliable due to implementation bugs. Since there are many different OpenGL implementations on Android - one per GPU vendor - you would need to do a lot of testing before trying to use this.

(This was one of several bad things that I discovered about Android when I experimented with it last year; another was that the debugger didn't work with multi-threaded programs. I was actually quite shocked by how poor a lot of it was once I poked below the surface. I believe some things have got better in the meantime.)

> On iOS there's also a slight pause in the animation during the
> texture upload.

On iOS, you can definitely have multiple OpenGL contexts, one per thread, both in theory and in practice.

> objC was new to me, and xcode 4 is fairly buggy and crashes quite
> often, and then there's all that additional complexity around object
> allocation initialization, and autorelease pools and references that
> you need to explicitly care for. The class files are longer than their
> java counterpart, and the need to free every little object explicitly
> adds a chance for making mistakes. I really don't believe in requiring
> programmers to forcibly care about this sort of stuff when most of
> these allocations that add to the programming-type complexity are so
> small that it doesn't even matter if they hang around for a while and
> are taken care of by GC at some convenient time later.

Right, I agree. So does Apple, and they now have GC on Mac OS and "automatic reference counting" on iOS.

However, my suggestion would be to use C++ with smart pointers. This has the advantage of working on every platform that you might ever want to use - even WebOS! - but not Windows Mobile.

Well, it's nice to ignore facts...

Posted Aug 21, 2011 16:16 UTC (Sun) by alankila (guest, #47141) [Link]

Thanks for swinging the cluebat. You seem to be right on the possibility for multiple GL contexts on iOS. They also say that android 3.0 and onwards should properly support this feature.

Well, it's nice to ignore facts...

Posted Aug 22, 2011 1:52 UTC (Mon) by cmccabe (guest, #60281) [Link] (10 responses)

> (This was one of several bad things that I discovered about Android when I
> experimented with it last year; another was that the debugger didn't work
> with multi-threaded programs. I was actually quite shocked by how poor at
> lot of it was once I poked below the surface. I believe some things have
> got better in the meantime.)

Multi-threaded debugging has always worked fine for pure Java code on Android. Debugging multi-threaded native code (aka NDK code, aka C/C++ code) is broken on Android 2.2 but it works on Android 2.3 and above.

> However, my suggestion would be to use C++ with smart pointers. This has
> the advantage of working on every platform that you might ever want to use
> - even WebOS! - but not Windows Mobile

Um, it depends on what "every platform you might ever want to use" is. Neither C nor C++ is supported at all on Blackberry or Windows Phone 7.

Android supports C and C++ through the NDK. However, the older NDK kits do not support C++ exceptions. There are reports that some people have gotten exceptions to work with the older NDK by including glibc and libstdc++ in their application, but that increases the application size by many megabytes.

Without exceptions, you cannot use std::tr1::shared_ptr, which is more or less the standard smart pointer in the C++ world these days. Most of the stuff in the STL uses exceptions too, which is inconvenient to say the least.

There is this thing called Objective C++ that you can use on iOS if you want. However, that is not necessarily a good idea. Basically, Apple views Objective C as a replacement for C++, and only supports Objective C++ for compatibility reasons.

Well, it's nice to ignore facts...

Posted Aug 22, 2011 16:25 UTC (Mon) by endecotp (guest, #36428) [Link]

> Debugging multi-threaded native code (aka NDK code, aka C/C++
> code), is broken on Android 2.2 but it works on Android 2.3 and
> above.

Right. I own 4 Android devices, and currently only one of them has >=2.3 available; that's my Galaxy Tab, and Samsung released the update just a few weeks ago. So on my other 3 devices I can still only do "printf-style" debugging. My Motorola Defy has only just got an update to 2.2!

It's actually even worse than that; on the 2.2 Galaxy Tab some vital symlink or somesuch was missing, which made even single-threaded native debugging impossible.

> the older NDK kits do not support C++ exceptions.

Right, that's one of the other surprising "bad things" that I was referring to. I was able to work around it by installing a hacked version of the tools from crystax.net.

> There is this thing called Objective C++ that you can use on
> iOS if you want.

I'm very familiar with it :-)

> However, that is not necessarily a good idea. Basically, Apple views
> Objective C as replacement for C++, and only supports Objective C++
> for compatibility reasons.

"Citation Required".

Well, it's nice to ignore facts...

Posted Aug 23, 2011 10:44 UTC (Tue) by jwakely (subscriber, #60262) [Link] (8 responses)

> Without exceptions, you cannot use std::tr1::shared_ptr, which is more or less the standard smart pointer in the C++ world these days. Most of the stuff in the STL uses exceptions too, which is inconvenient to say the least.

Both boost::shared_ptr and GCC's tr1::shared_ptr can be used without exceptions. Failed memory allocations will abort. The only other throwing operation is converting a weak_ptr to a shared_ptr, which can be replaced by calling weak_ptr::lock(), which is non-throwing.

GCC's C++ standard library can be used with -fno-exceptions and I'd be very surprised if other implementations don't have something equivalent. In normal use there are few places where the C++ Standard Library throws exceptions, and they can often be avoided by checking preconditions first (e.g. don't call std::vector::at() without checking that the index isn't out of range first).
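
Concretely, code built with -fno-exceptions can sidestep both throwing operations like this (a small sketch; the function and variable names are invented):

// Avoiding the throwing operations mentioned above, suitable for
// building with -fno-exceptions.
#include <tr1/memory>
#include <vector>
#include <cstddef>
#include <cstdio>

void use(const std::tr1::weak_ptr<int>& weak,
         const std::vector<int>& v, std::size_t i) {
    // shared_ptr<int> sp(weak) would throw bad_weak_ptr if 'weak' has
    // expired; lock() just hands back an empty shared_ptr instead.
    std::tr1::shared_ptr<int> sp = weak.lock();
    if (sp)
        std::printf("still alive: %d\n", *sp);

    // v.at(i) would throw out_of_range; check the precondition and
    // use operator[] instead.
    if (i < v.size())
        std::printf("element %lu: %d\n", (unsigned long)i, v[i]);
}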

Well, it's nice to ignore facts...

Posted Aug 23, 2011 19:24 UTC (Tue) by cmccabe (guest, #60281) [Link] (7 responses)

> Both boost::shared_ptr and GCC's tr1::shared_ptr can be used without
> exceptions. Failed memory allocations will abort. The only other throwing
> operation is converting a weak_ptr to a shared_ptr, which can be replaced
> by calling weak_ptr::lock() which is non-throwing.

That is technically true, but a little bit misleading.

Code using tr1::shared_ptr will not compile without support for RTTI. Now, you could enable RTTI without enabling exceptions, but nobody actually does, because RTTI requires exceptions in order to function in any reasonably sane way. Otherwise, the entire program aborts when a dynamic_cast to a reference type fails. And I don't think even the most die-hard C++ advocate could put a positive spin on that.

Realizing this, Google compiled their old libc without support for exceptions or RTTI. So you will not be able to use shared_ptr with the old NDK, only with the new one-- sorry.

There is talk of removing the dependency on RTTI from tr1::shared_ptr. But of course that will take years to be agreed on by everyone and rolled out, assuming that it goes forward.
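
For reference, this is the pointer/reference distinction in question (a contrived sketch; Base/Derived/Other are invented types):

// The pointer form of dynamic_cast reports failure with a null pointer;
// the reference form has no choice but to throw std::bad_cast, so with
// RTTI enabled but exceptions disabled a failed reference cast simply
// terminates the program.
#include <cstdio>

struct Base    { virtual ~Base() {} };
struct Derived : Base {};
struct Other   : Base {};

void probe(Base& b) {
    // Failure here is just a null pointer - no exception machinery needed.
    if (Derived* d = dynamic_cast<Derived*>(&b))
        std::printf("got a Derived at %p\n", static_cast<void*>(d));

    // Failure here throws std::bad_cast; built with -fno-exceptions there
    // is nothing to throw, so the process aborts instead.
    Derived& dr = dynamic_cast<Derived&>(b);
    (void)dr;
}

int main() {
    Other o;
    probe(o);   // the pointer cast fails quietly, the reference cast does not
}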

Well, it's nice to ignore facts...

Posted Aug 23, 2011 20:09 UTC (Tue) by cmccabe (guest, #60281) [Link] (6 responses)

> Realizing this, Google compiled their old libc without support
> for exceptions or RTTI. So you will not be able to use shared_ptr
> with the old NDK, only with the new one-- sorry.

er, I meant libstdc++

Well, it's nice to ignore facts...

Posted Aug 24, 2011 3:00 UTC (Wed) by njs (subscriber, #40338) [Link]

Man, if only there were a full-featured free-software C/C++ standard library that they could have used to avoid this whole mess.

Well, it's nice to ignore facts...

Posted Aug 24, 2011 21:53 UTC (Wed) by jwakely (subscriber, #60262) [Link] (4 responses)

I fixed that nearly two years ago: http://gcc.gnu.org/PR42019

Well, it's nice to ignore facts...

Posted Aug 24, 2011 22:12 UTC (Wed) by jwakely (subscriber, #60262) [Link] (3 responses)

> There is talk of removing the dependency on RTTI from tr1::shared_ptr. But of course that will take years to be agreed on by everyone and rolled out, assuming that it goes forward.

What talk exactly? You know TR1 is finished, right? It is what it is; there will be no more changes to the document. But if you want changes to libstdc++'s implementation of tr1::shared_ptr, just ask me; if it's reasonable I'll consider it.

But please stop making misleading comments about C++ that ignore facts. As the bug I linked to shows, it didn't take years to agree on, it took 8 days.

Well, it's nice to ignore facts...

Posted Aug 24, 2011 22:38 UTC (Wed) by jwakely (subscriber, #60262) [Link]

And if you're *only* talking about the Android NDK version of libstdc++, I don't see why it should take years to backport a simple fix from upstream.

Well, it's nice to ignore facts...

Posted Sep 4, 2011 20:45 UTC (Sun) by cmccabe (guest, #60281) [Link] (1 responses)

Hi jwakely,

I did not mean to imply that the libstdc++ maintainers were slow. However, rollout of new libstdc++ versions can be quite delayed, as you know. Using shared_ptr without exceptions on older Android versions just isn't going to compile, and it would be misleading to suggest otherwise. That was what I was trying to avoid.

Just out of curiosity, are the -fno-rtti and -fno-exceptions modes part of any standard, or just something that GCC and a few other compilers implement?

P.S. as a former C++ user, thanks for all your work on libstdc++

Well, it's nice to ignore facts...

Posted Sep 4, 2011 21:08 UTC (Sun) by jwakely (subscriber, #60262) [Link]

Sorry for misunderstanding then. The C++ standard includes RTTI and exceptions as part of the language. They're not optional, so disabling them takes you into non-standard territory (but still reasonably portable, as -fno-rtti or -fno-exceptions and their equivalents are quite common non-standard features.)

There is (or was) an "Embedded C++" dialect which omits RTTI and exceptions, among other features, but it's not a standard and as Stroustrup has said "To the best of my knowledge EC++ is dead (2004), and if it isn't it ought to be."

That's the problem with GC

Posted Aug 21, 2011 17:55 UTC (Sun) by khim (subscriber, #9252) [Link] (32 responses)

> I'm testing the application on a Galaxy S. No choppiness is evident. According to logcat, GC pause lengths vary from < 10 ms to 37 ms, the majority of the pauses being 20 ms.

That's the problem with GC. It works quite well in tests, but not so well in practice. What happens when your application runs for half an hour and memory is badly fragmented? What happens when there are some other applications in the background which also need to frequently run GC? This is where 2-4-8 2GHz cores will be helpful to mitigate the GC disease. Eventually, when hardware is significantly more powerful, GC-based systems will finally work as well as the non-GC-based original iPhone did back in 2007 with its 412MHz CPU... Perhaps by then Apple will decide that it's OK to give GC to iOS developers too.

> However, iOS version was much harder to write because objC was new to me, and xcode 4 is fairly buggy and crashes quite often, and then there's all that additional complexity around object allocation initialization, and autorelease pools and references that you need to explicitly care for.

And this was the whole point, right: Java makes it easy to write a mediocre UI, but not so easy to write a good UI. Objective-C and the iOS tools in general are geared toward great UIs, but sometimes it's hard to write something which "just barely works". Which was my original point.

That's the problem with GC

Posted Aug 21, 2011 19:23 UTC (Sun) by HelloWorld (guest, #56129) [Link] (27 responses)

> That's the problem with GC. It works quite well in tests, but not so well in practice. What happens when your application works for half-hour and memory is badly fragmented?
Actually, fragmentation is usually less of an issue on garbage-collected systems, because the GC can defragment memory, which isn't feasible in languages like C where pointers aren't opaque.

> What happens when there are some other applications in background which also need to frequently run GC?
Why should that be a problem?

And THAT is the problem

Posted Aug 22, 2011 12:43 UTC (Mon) by khim (subscriber, #9252) [Link] (26 responses)

> Actually, fragmentation is usually less of an issue on garbage-collected systems, because the GC can defragment memory, which isn't feasible in languages like C where pointers aren't opaque.

Right. And this is where you experience the most extreme dropouts and slowdowns. How will you compact a heap with multimegabyte arrays without significant delays?

And THAT is the problem

Posted Aug 22, 2011 15:04 UTC (Mon) by HelloWorld (guest, #56129) [Link] (25 responses)

> Right. And this is where you experience the most extreme dropouts and slowdowns.
Really? Do you have any data to back this up? Can you cite any measurements to that effect?

> How will you compact a heap with multimegabyte arrays without significant delays?
I don't know, but that doesn't mean it's not possible. I mean, people have been writing books about GCs (e.g. http://www.amazon.com/dp/0471941484/), do you really expect me to answer this kind of question in an LWN comment?

And THAT is the problem

Posted Aug 22, 2011 17:00 UTC (Mon) by khim (subscriber, #9252) [Link] (24 responses)

> > Right. And this is where you experience the most extreme dropouts and slowdowns.
> Really? Do you have any data to back this up? Can you cite any measurements to that effect?

Ah, now we're back to the whole BFS debate. No, I have no benchmarks to present. I know how to make a Java interface not "extremely sucky" but "kind-of-acceptable" - but yes, it's kind of black magic and I'm not 100% sure all the techniques are actually required and proper. The main guide is dropout benchmarks for real programs: you just log the timing of operations and tweak the architecture till they show acceptable timings. And I know I never need such black magic for simple, non-GC-driven programs: there I can measure timings without a lot of complicated experiments and be reasonably sure they will translate well to the end product. Not so with GC.

> > How will you compact a heap with multimegabyte arrays without significant delays?
> I don't know, but that doesn't mean it's not possible. I mean, people have been writing books about GCs (e.g. http://www.amazon.com/dp/0471941484/), do you really expect me to answer this kind of question in an LWN comment?

Sure. These are typical problems for real programs. And if the best answer you can offer is "there are a lot of papers on the subject, surely the CS wizards solved the problem long ago", then I'm not convinced.

Because these are the same "wizards from the Ivory Tower" who proclaimed 20 years ago that "Among the people who actually design operating systems, the debate is essentially over. Microkernels have won."

This was a nice theory, but practice is different. In practice two out of three surviving major OSes are microkernel-based only in name and one is not a microkernel at all. I suspect the same will happen with GC: the wizards promised that with GC you can just ignore memory issues and concentrate on what you need to do, but in practice it only works for batch computation (things like compilers or background indexers), while in UIs you spend so much time fighting the GC that the savings become utterly pointless.

And THAT is the problem

Posted Aug 22, 2011 17:42 UTC (Mon) by cmccabe (guest, #60281) [Link] (2 responses)

> Sure. These are typical problems for real programs. And if the best answer you
> can offer is "there are a lot of papers on the subject, surely the CS wizards
> solved the problem long ago", then I'm not convinced.

Why don't you check out the incremental garbage collector that was implemented in Android 2.3? It exists and is deployed in the real world, not an ivory tower.

http://android.git.kernel.org/

And THAT is the problem

Posted Aug 25, 2011 13:30 UTC (Thu) by renox (guest, #23785) [Link] (1 responses)

incremental GC != real time GC.

To have no pauses, you need a real-time GC, not merely an incremental GC!
And real-time GCs, especially free ones, are rare indeed.

And THAT is the problem

Posted Sep 4, 2011 20:41 UTC (Sun) by cmccabe (guest, #60281) [Link]

Non sequitur. User interfaces have never been a hard real-time problem. Human reaction latencies are measured in hundreds of milliseconds.

And THAT is the problem

Posted Aug 22, 2011 18:06 UTC (Mon) by pboddie (guest, #50784) [Link] (13 responses)

> I suspect the same will happen with GC: the wizards promised that with GC you can just ignore memory issues and concentrate on what you need to do, but in practice it only works for batch computation (things like compilers or background indexers), while in UIs you spend so much time fighting the GC that the savings become utterly pointless.

Someone gives you data and you just come back with more conjecture!

The good versus bad of GC is debated fairly actively in various communities. For example, CPython uses a reference-counting GC whose performance has been criticised from time to time by various parties for different reasons. As a consequence, implementations like PyPy have chosen different GC architectures. The developer of the HotPy implementation, who now appears to be interested in overhauling CPython, also advocates a generational GC, I believe, which means that there is some kind of emerging consensus.

There has even been quite a bit of work to measure the effect of garbage collection strategies on the general performance of virtual machines, and that has definitely fed into actual implementation decisions. This isn't a bunch of academics hypothesising, but actual real-world stuff.

And THAT is the problem

Posted Aug 22, 2011 19:38 UTC (Mon) by zlynx (guest, #2285) [Link] (12 responses)

Personal experience in programming user interactive software in Java is not conjecture. It is fact.

I can back that up with my own personal experience. Java software and C#/.NET too will show unexpected and very annoying pauses whenever the GC is required to run. C or C++ software, even Perl and Python software never demonstrated this erratic behavior.

If you would like to see it for yourself, please run JEdit on a system with 256MB RAM while editing several files of several megabytes each. That is one application I know I experienced problems with while Emacs and vi never acted funny.

And THAT is the problem

Posted Aug 22, 2011 22:40 UTC (Mon) by cmccabe (guest, #60281) [Link] (3 responses)

> I can back that up with my own personal experience. Java software and
> C#/.NET too will show unexpected and very annoying pauses whenever the GC
> is required to run.

The JVMs in use on servers do not use incremental garbage collection. I mentioned this in the post that started this thread.

> C or C++ software, even Perl and Python software never
> demonstrated this erratic behavior.

CPython's garbage collector is based on reference counting, which is inherently incremental. So it's no surprise that you don't see long pauses there.

Perl's garbage collector is "usually" based on reference counting. It does a full mark and sweep when a thread shuts down, apparently. See http://perldoc.perl.org/perlobj.html#Two-Phased-Garbage-C...

> If you would like to see it for yourself, please run JEdit on a system
> with 256MB RAM while editing several files of several megabytes each.

I would like to, but unfortunately systems with 256MB of RAM are no longer manufactured or sold.

> That
> is one application I know I experienced problems with while Emacs and vi
> never acted funny.

Ironically, emacs actually implements its own garbage collector, which is based on mark-and-sweep. So apparently even on your ancient 256 MB machine, old-fashioned stop-the-world GC is fast enough that you don't notice it. As a side note, it's a little strange to see emacs held up as a shining example of efficient programming. The sarcastic joke in the past was that emacs stood for "eight megs and constantly swapping." Somehow that doesn't seem like such a good punchline any more, though. :)

The bottom line is that well-implemented garbage collection has its place on modern systems.

And THAT is the problem

Posted Aug 23, 2011 2:26 UTC (Tue) by viro (subscriber, #7872) [Link] (1 responses)

boot with mem=256M in kernel command line, assuming that it wasn't a snide "who the fsck cares about insufficiently 31137 boxen???"...

And THAT is the problem

Posted Aug 24, 2011 18:07 UTC (Wed) by cmccabe (guest, #60281) [Link]

It was just a snark. But thanks for the suggestion; it could be interesting for testing low-memory conditions.

There is nothing ironic here...

Posted Aug 24, 2011 9:07 UTC (Wed) by khim (subscriber, #9252) [Link]

> Ironically, emacs actually implements its own garbage collector, which is based on mark-and-sweep. So apparently even on your ancient 256 MB machine, old-fashioned stop-the-world GC is fast enough that you don't notice it.

This machine may be ancient, but emacs is beyond ancient. You said it yourself:

> As a side note, it's a little strange to see emacs held up as a shining example of efficient programming. The sarcastic joke in the past was that emacs stood for "eight megs and constantly swapping."

It's still a good punchline - but now it can be used as a showcase for GC. If your system is so beefy, so overpowered, so high-end that you can actually throw 10 times as much at the problem as it actually needs... then sure as hell GC is acceptable. But this is not what GC proponents will tell you, right?

And THAT is the problem

Posted Aug 23, 2011 5:55 UTC (Tue) by alankila (guest, #47141) [Link] (3 responses)

Note that even in Java 6, there are multiple GC strategies to pick from today. Things change -- what was true in 2000 or whenever you conducted your tests (based on 256 MB of memory) isn't true today. And we don't know if this was because jEdit was configured with so much heap that it swapped out of your RAM or whatever...

I do know, however, that even the mark-and-sweep collectors in practice limit pauses to less than 100 ms, because I have written audio applications in Java with buffers shorter than 100 ms and they run without glitching. This sort of application should have its heap size tuned appropriately, because a too-large heap will have a lot of objects to collect when the cycle triggers, and this in turn can cause glitches, whereas a small heap will have frequent but fast enough collections.

The G1GC strategy looks very promising, because it concentrates GC effort on memory regions that can be freed with very little work, and it supports soft realtime constraints for limiting the length of a single GC cycle. It looks to be something to the tune of an order of magnitude faster than the other strategies, but I haven't personally tried it yet.

This is my experience as well...

Posted Aug 24, 2011 9:00 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

> This sort of application should have its heap size tuned appropriately, because a too-large heap will have a lot of objects to collect when the cycle triggers, and this in turn can cause glitches, whereas a small heap will have frequent but fast enough collections.

Yeah. It works. But the main stated goal of the GC, its raison d'être, is the ability to forget about memory allocation. Remember: no memory leaks, no more confusion about ownership, etc.? The GC pseudoscience fails to deliver, sorry. Just like relational database theory fails to deliver. That does not mean either is useless - if your goal is "something working" they are often "good enough". But the problem with Java is the fact that GC is imposed. You cannot avoid it because the standard library requires it. So in the end you fight to the death with the one thing which was supposed to "free you" from some imaginary tyranny.

This is my experience as well...

Posted Aug 25, 2011 10:37 UTC (Thu) by alankila (guest, #47141) [Link]

Your arguments sound really bizarre to me.

Think about what I just said: I have positive experience working with a relatively low-latency application in the real world. To get it, all I have to do is adjust one tunable -- the heap size. And I hinted that with G1GC even that adjustment is now unnecessary, but I'll wait until G1GC is actually the default. JDK7 maybe, when it rolls out for OS X I'll probably check it out.

This is my experience as well...

Posted Aug 31, 2011 14:37 UTC (Wed) by nix (subscriber, #2304) [Link]

So rather than agonizing over object ownership, we tweak one tunable, the heap size, yet the existence of this single tunable apparently makes GCs 'pseudoscience'?

Hint: GCs are actually studied, quite intensively, by actual computer scientists. Science is what scientists do: thus, GC research is science. Enough of the badmouthing. (It is quite evident to me at least that no amount of evidence will change your opinion on this score: further discussion is clearly pointless.)

And THAT is the problem

Posted Aug 23, 2011 11:00 UTC (Tue) by pboddie (guest, #50784) [Link] (3 responses)

> I can back that up with my own personal experience. Java software and C#/.NET too will show unexpected and very annoying pauses whenever the GC is required to run. C or C++ software, even Perl and Python software never demonstrated this erratic behavior.

Yet Perl and Python employ garbage collection. My point was that blanket statements about GC extrapolated from specific implementations of specific platforms don't inform the discussion, but I'm pretty sure that this is a replay of a previous discussion anyway, fuelled by the same prejudices, of course.

Puhlease...

Posted Aug 24, 2011 8:52 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

1. Perl and Python are unbearably slow and laggy when used by themselves. Thankfully, no one in their right mind will ever try to use them without a wide array of supporting C libraries. These "fast core" libraries don't employ GC.

2. Most Perl and Python scripts are batch-mode scripts. GC is perfectly OK for such use (when you only care about throughput and not about latency).

3. Current implementations of Perl and Python use a refcounting GC which is, of course, as reliable WRT latency as manual memory allocation. We'll see what happens when PyPy and other "advanced" implementations become mainstream.

Puhlease...

Posted Aug 25, 2011 10:55 UTC (Thu) by alankila (guest, #47141) [Link] (1 responses)

1. The language speed is due to their interpreting loop rather than the fact that they use GC. As a rule of thumb, an interpreter evaluating an opcode stream appears to be 10 times slower than a compiler that translates it into native form.

2. Probably true. Missing from this discussion is the observation that even malloc/free can be unpredictable with respect to latency because they must maintain the free list and occasionally optimize it to maintain performance of allocations. No doubt advances to malloc technology have happened and will happen, and there are multiple implementations to pick from that expose different tradeoffs.

3. Python, I've been told, also contains a true GC in addition to the refcounter. I think the largest single advantage of a refcounter is that it gives a very predictable lifecycle for an object, often removing it as soon as the code exits a block. This makes user code simpler to write because filehandles don't need to be closed and database statement handles go away automatically. Still, this is synchronous object destruction, and scheduling object destruction during user think-time would give better user experience.

Puhlease...

Posted Aug 25, 2011 11:41 UTC (Thu) by pboddie (guest, #50784) [Link]

> 1. The language speed is due to their interpreting loop rather than the fact that they use GC. As a rule of thumb, an interpreter evaluating an opcode stream appears to be 10 times slower than a compiler that translates it into native form.

Indeed. One can switch off GC and just let programs allocate memory until they exit to see the performance impact of GC, if one is really interested in knowing what that is. This is something people seem to try only infrequently and in pathological cases - it's not a quick speed-up trick.

> 2. Probably true. Missing from this discussion is the observation that even malloc/free can be unpredictable with respect to latency because they must maintain the free list and occasionally optimize it to maintain performance of allocations. No doubt advances to malloc technology have happened and will happen, and there are multiple implementations to pick from that expose different tradeoffs.

Quite right. This is not so different from any discussions of latency around garbage collectors.

> 3. Python, I've been told, also contains a true GC in addition to the refcounter.

CPython uses reference counting and a cycle detector. PyPy uses a generational GC by default, if I remember correctly. The PyPy people did spend time evaluating different GCs and found that performance was significantly improved for some over others.

> I think the largest single advantage of a refcounter is that it gives a very predictable lifecycle for an object, often removing it as soon as the code exits a block.

This advantage is almost sacred to the CPython developers, but I don't think it is entirely without its own complications. Since we're apparently obsessed with latency now, I would also note that a reference counting GC is also unlikely to be unproblematic with regard to latency purely because you can have a cascade of unwanted objects and the GC would then need to be interruptable so that deallocation work could be scheduled at convenient times. This could be done (and most likely is done) in many different kinds of GC.

And THAT is the problem

Posted Aug 22, 2011 19:39 UTC (Mon) by HelloWorld (guest, #56129) [Link] (3 responses)

> Ah, now we're back to the whole BFS debate. No, I have no benchmarks to present.
Well, then you don't really have a point, do you?

> Sure. These are typical problems for real programs.
Again, do you have any data to back this up? Because I really doubt that this is a problem for 99% of all applications.

> Because these are the same "wizards from the Ivory Tower" who proclaimed 20 years ago that "Among the people who actually design operating systems, the debate is essentially over. Microkernels have won."

> This was a nice theory, but practice is different. In practice two out of three surviving major OSes are microkernel-based only in name and one is not a microkernel at all.
That doesn't mean anything as long as you don't show that the failure of microkernel-based OSes on the desktop can be attributed directly to the microkernel design. There may well have been other factors: inertia, lack of hardware and software vendors, bad luck, etc. Also, microkernels were actually a success in the embedded world. QNX is just one example of this.

And THAT is the problem

Posted Aug 23, 2011 5:00 UTC (Tue) by raven667 (subscriber, #5198) [Link] (2 responses)

Microkernels are also a success in the modern server world; the name has changed but the design is there, it is just called a hypervisor now.

And THAT is the problem

Posted Aug 23, 2011 17:19 UTC (Tue) by bronson (subscriber, #4806) [Link] (1 responses)

I don't think this is true. Microkernels are a collection of OS processes interacting to provide a rich interface to userland (file systems, networking stack, IPC, etc), while hypervisors don't intend to provide OS services at all. They're just a thin hardware abstraction layer so that multiple operating systems (microkernel or not) can be fooled into coexisting and maybe some very coarse resource management.

Like atoms vs. the solar system, they might look similar if you look from afar. It's certainly possible to cherry-pick theoretical similarities. For real-world work, though, they tend to be quite different.

And THAT is the problem

Posted Aug 25, 2011 20:37 UTC (Thu) by raven667 (subscriber, #5198) [Link]

I was thinking that the combination of the hypervisor and the guest kernel in total formed the microkernel. Each guest has its own personality and file systems, network, ipc, etc. Information is passed back and forth through the hypervisor using interfaces that happen to look a lot like network cards and block devices, or use PV drivers which can do pure message passing. The big thing is that every bit runs in its own protected memory space which I thought was the big difference and value of a microkernel vs. a traditional kernel.

Like you said, I'm certainly standing from afar and squinting more than a little. 8-)

And THAT is the problem

Posted Aug 23, 2011 23:36 UTC (Tue) by rodgerd (guest, #58896) [Link] (2 responses)

You mean like the microkernels used in MacOS, the iPhone that you're touting as the model of how to do things, and Windows?

Boy microkernels didn't get anywhere, did they.

And THAT is the problem

Posted Aug 24, 2011 0:12 UTC (Wed) by rahulsundaram (subscriber, #21946) [Link]

MacOS and Windows do not claim to be microkernels anymore, and they are not. The only widespread use of a microkernel that I am aware of is QNX.

And THAT is the problem

Posted Aug 24, 2011 7:44 UTC (Wed) by anselm (subscriber, #2796) [Link]

MacOS may be based on the Mach microkernel, but given that it has a big monolithic BSD emulation layer – and nothing else – on top it can by no stretch of the imagination be called a »microkernel OS«. (Andrew Tanenbaum probably wouldn't like it any more than he likes Linux.) Very similar considerations apply to Windows; having the graphics driver in the kernel is not what one would expect in a microkernel OS.

It is fair to say that the microkernel concept, while academically interesting, has so far mostly failed to stand up to the exigencies of practical application. There are exceptions (QNX comes to mind), but despite previous claims to the contrary no mainstream operating system would, in fact, pass as a »microkernel OS«. At least Linux is honest about it :^)

That's the problem with GC

Posted Aug 22, 2011 3:35 UTC (Mon) by alankila (guest, #47141) [Link] (3 responses)

A question of finesse:

Does objC run object releases in batches after an iteration in the main loop has executed, or does it release them synchronously when I type [foo release]? One of the supposed advantages of GC over malloc/free style management is that the VM can attempt to arrange GC to occur asynchronously with respect to other work being done. I think Dalvik does something like this.

Dalvik does not have a compacting GC last time I heard, so heap fragmentation could in theory result in more unusable memory and therefore more common GC cycles. There's nothing I can do about heap fragmentation on either iOS or Android, so I don't worry about it.

That's the problem with GC

Posted Aug 22, 2011 16:37 UTC (Mon) by endecotp (guest, #36428) [Link] (2 responses)

> Does objC run object releases in batches after an iteration
> in the main loop has executed, or does it release them
> synchronously when I type [foo release]?

If you [foo release], it releases it synchronously. If you [foo autorelease], it releases it later.

> One of the supposed advantages of GC over malloc/free style
> management is that the VM can attempt to arrange GC to occur
> asynchronously with respect to other work being done.

Right. I have never heard anyone claim that objC's autorelease is faster than synchronous release, however. My guess is that any benefit of postponing the actual release is offset by the effort needed to add it to the autorelease pool in the first place.

The most successful example of this sort of "postponed release" that I've seen is Apache's memory pools. Apache manages per-connection and per-request memory pools from which allocations can be made contiguously, with no tracking overhead. These allocations don't need to be individually freed, but rather the whole pool is freed at the end of the request or connection. I am surprised that this sort of thing is not done more often, as it should have both performance and ease-of-coding benefits.

That's the problem with GC

Posted Aug 22, 2011 19:31 UTC (Mon) by kleptog (subscriber, #1183) [Link] (1 responses)

Memory pools are also used in PostgreSQL. They have the fantastic property of being able to say 'this transaction is aborted, forget everything related to it' in almost constant time. They're useful when dealing with user code because you can arrange for the code to work in a particular context, and when the code is done, you simply clear the context in one go.

It's not free though. It does mean that if you want data to survive for longer periods you need to copy the data to a new context. It means that for functions the context of associated memory becomes part of the API, and you need to be careful that people respect the conventions, or you easily get dangling pointers. And valgrind gets confused. And external libraries sometimes don't get along well with it. And you don't get destructors (although I understand Samba has a memory pool architecture with destructors).

But you never get memory leaks, which is good for reliability. And that makes up for a lot.

That's the problem with GC

Posted Aug 23, 2011 10:59 UTC (Tue) by jwakely (subscriber, #60262) [Link]

Yeah, memory pools are great. In a previous life I grafted C++ destructors onto Apache memory pools, IIRC by overloading 'new' to register functions to be run when the pool gets torn down.
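
A toy sketch of that idea - a pool that hands out memory with no per-object free, plus registered destructors that run at teardown. It is nothing like the real apr_pool code; the names here are invented:

#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

class Pool {
    struct Cleanup { void (*run)(void*); void* obj; };
    std::vector<void*> blocks_;
    std::vector<Cleanup> cleanups_;

    template <typename T>
    static void destroy(void* p) { static_cast<T*>(p)->~T(); }

public:
    void* alloc(std::size_t n) {            // never freed individually
        void* p = std::malloc(n);
        blocks_.push_back(p);
        return p;
    }

    template <typename T>
    T* create() {                           // placement-new + register the dtor
        T* obj = new (alloc(sizeof(T))) T();
        Cleanup c = { &destroy<T>, obj };
        cleanups_.push_back(c);
        return obj;
    }

    ~Pool() {                               // end of the request/connection
        for (std::size_t i = cleanups_.size(); i-- > 0; )
            cleanups_[i].run(cleanups_[i].obj);
        for (std::size_t i = 0; i < blocks_.size(); ++i)
            std::free(blocks_[i]);
    }
};

struct Session { /* ... */ };

void handle_request() {
    Pool pool;
    Session* s = pool.create<Session>();    // lives exactly as long as the pool
    (void)s;
}   // ~Pool runs Session's destructor and releases every allocation at once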

Lack of sales/interest due to slow hardware?

Posted Aug 20, 2011 3:30 UTC (Sat) by pr1268 (guest, #24648) [Link] (1 responses)

> WebOS ran slowly and was power-hungry because it was based around Javascript,
> a language that was never designed for high efficiency.

Be that as it may, it still doesn't explain the speed difference of WebOS on the two different hardware platforms.

Unless the iPhone2 has some JS speed booster/optimizer and WebOS runs on top of this interpreter, I still don't see how "twice as fast" could be realized in software alone. I do agree that JS seems like a poor choice on which to base an entire OS.

Lack of sales/interest due to slow hardware?

Posted Aug 20, 2011 15:53 UTC (Sat) by cmccabe (guest, #60281) [Link]

There is nothing to really explain, because we don't know what the speed difference was or how it was measured.

For example, WebOS uses the V8 Javascript engine, whereas iOS uses Nitro. As far as I know, V8 is still the faster of the two. So I'm sure you could come up with some benchmark like loading a complicated web page where webOS wins.

