
Memory leaks in managed languages


Posted May 3, 2009 0:02 UTC (Sun) by JoeBuck (subscriber, #2330)
In reply to: Great news by sbergman27
Parent article: Tomboy, Gnote, and the limits of forks

Languages that use garbage collection are free from one source of memory leaks; they can collect an object that is no longer reachable by any pointer or reference. However, if the programmer uses this feature to get sloppy, he/she often winds up with ever-growing memory because objects that are no longer needed are still linked into a massive, confusing data structure, and the result can be that the program grows and grows, failing to free unneeded storage, just like a leaky C++ program.
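A minimal Python sketch of that failure mode, assuming a hypothetical `undo_log` standing in for the "massive, confusing data structure". The `Note` class and `run_session` function are illustrative names, not anything from Tomboy or Gnote. Every closed note is still reachable through the log, so no collector can free it; a bounded log is shown only for contrast.

```python
from collections import deque


class Note:
    """Hypothetical object the program stops needing once it is closed."""

    def __init__(self, text):
        self.text = text


def run_session(undo_log, notes_created):
    """Create and close notes, recording each one in undo_log as we go."""
    for _ in range(notes_created):
        note = Note("testing")
        undo_log.append(note)  # the note stays reachable through the log
        # ... the user closes the note here; it is never needed again
    return len(undo_log)


unbounded = run_session([], 1000)              # every closed note is retained
bounded = run_session(deque(maxlen=10), 1000)  # old entries fall off the end
```

Here `unbounded` ends up at 1000 retained objects while `bounded` stays at 10. The collector behaves identically in both runs; only the data structure differs.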

On the other hand, I verified with valgrind that gnote is a leaky C++ program. The developer might want to check that; just create and delete a couple of notes, and there are leaks.



Memory leaks in managed languages

Posted May 3, 2009 1:28 UTC (Sun) by sbergman27 (guest, #10767) [Link] (21 responses)

"""
On the other hand, I verified with valgrind that gnote is a leaky C++ program.
"""

Well, maybe this is a leak, and maybe it's just horrific memory usage. But I just installed Tomboy. Started it. Cleared the "start here" default note, and restarted Tomboy. And added 10 notes, one by one. Each one simply said "testing".

Shared memory started at 19M and remained constant throughout the test.

RSS was as follows:

Note# RSS
0 32M
1 43M
2 50M
3 57M
4 63M
5 69M
6 75M
7 82M
8 88M
9 94M
10 101M

Incredulous, I added another 10. At the end of the 20th note, RSS was up to 168M! Tomboy starts out using about as much memory as Firefox on start, and then sucks up 7MB for each instance of storing the word "testing" in a note. I find this incredible. Far worse than I had thought previously. 168M to store 140 bytes. Maybe it's the Unicode...
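The per-note figure can be checked directly from the numbers in this comment. This is only arithmetic on the reported values; the variable names are made up for the sketch.

```python
# RSS in MB after each note, copied from the table above (index = note count).
rss_mb = [32, 43, 50, 57, 63, 69, 75, 82, 88, 94, 101]

# Growth caused by each individual note.
deltas = [after - before for before, after in zip(rss_mb, rss_mb[1:])]

# Average growth per note over the first ten notes.
avg_first_ten = (rss_mb[-1] - rss_mb[0]) / 10

# Same average over all twenty notes, using the reported 168M end point.
avg_all_twenty = (168 - rss_mb[0]) / 20

# Payload actually stored: twenty notes holding the 7-byte word "testing".
payload_bytes = 20 * len("testing")
```

The first note costs 11M and each later one 6M to 7M, which averages out to just under 7M per note in both halves of the test, against a payload of 140 bytes.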

Could you please run that test with Gnote and post the results? I'm curious how it compares.

Memory leaks in managed languages

Posted May 3, 2009 4:07 UTC (Sun) by JoeBuck (subscriber, #2330) [Link] (8 responses)

It looks quite similar; as reported by top, the virtual memory size for gnote rises to 110m with ten notes displayed. My guess is that the memory is mostly used by the gtk+ objects, which are probably just the same for the two programs.

The leaks I mentioned don't seem to be very large.

By contrast, an instance of Emacs visiting ten files is only 18m. Yet people still claim that Emacs is a bloated program.

Memory leaks in managed languages

Posted May 3, 2009 4:41 UTC (Sun) by sbergman27 (guest, #10767) [Link] (6 responses)

"""
the virtual memory size for gnote rises to 110m with ten notes displayed
"""

Virtual memory size? That's completely useless. What is the RSS size? (Strictly speaking you should turn off swap to take that measurement, but if you have plenty of memory it shouldn't make much, if any, difference.)

Just for reference, the virtual memory size for Tomboy starts out at 343M and increases to 439M with 10 notes.

Memory leaks in managed languages

Posted May 3, 2009 6:10 UTC (Sun) by JoeBuck (subscriber, #2330) [Link] (5 responses)

I get 53M for RSS with ten notes displayed.

Memory leaks in managed languages

Posted May 3, 2009 12:55 UTC (Sun) by sbergman27 (guest, #10767) [Link] (4 responses)

So about half of what Tomboy uses in that scenario. You may already know all this, but RSS tells us how much actual physical RAM is being used by the process. Virtual (and someone please correct me if I am inaccurate) is just how much memory the process has requested. Since Linux does lazy allocation of memory, and most apps don't actually touch nearly as much memory as they request, the virtual number is pretty meaningless. If you sort by size, descending, and then start adding up the virtual column, you'll likely quickly reach a point where total virtual far exceeds the sum of your RAM and swap. It's pretty much a fiction.

To get the "real" unique memory usage for the process, you can (with the swap turned off) subtract the shared column from the rss column. The smem utility, featured a few days ago here on LWN, can show other interesting things, like proportional set size.
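The same subtraction can be done without top by reading `/proc/<pid>/statm`, whose second and third fields are the resident and shared page counts. A rough sketch follows; the function name and the sample line are made up, with the sample sized to resemble the Tomboy numbers quoted earlier, and a 4096-byte page is assumed.

```python
def unique_resident_kb(statm_line, page_size=4096):
    """Return (rss_kb, shared_kb, rss_kb - shared_kb) from a statm line.

    /proc/<pid>/statm lists sizes in pages; fields 2 and 3 are the
    resident and shared page counts.
    """
    fields = statm_line.split()
    resident_pages, shared_pages = int(fields[1]), int(fields[2])
    rss_kb = resident_pages * page_size // 1024
    shared_kb = shared_pages * page_size // 1024
    return rss_kb, shared_kb, rss_kb - shared_kb


# Made-up sample line: about 439M virtual, 101M resident, 19M shared.
sample = "112384 25856 4864 12 0 30000 0"
rss_kb, shared_kb, unique_kb = unique_resident_kb(sample)
```

On a live system the line would come from reading `/proc/<pid>/statm` and the page size from `os.sysconf("SC_PAGE_SIZE")`. As noted above, the result only approximates unique usage, since anything swapped out is not counted.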

Sometimes I think that the authors of "top" had a special conference to determine how to present memory information in the most misleading way possible, and then wrote "top" based upon the results. Many people do think that the virtual size actually means something. And most new users seem to think that Linux is a sort of monster black hole of memory leakage, since no matter what they are doing, it sucks up all available memory until there is almost no free memory left. Htop is, at least, better.

Memory leaks in managed languages

Posted May 3, 2009 17:12 UTC (Sun) by hppnq (guest, #14462) [Link] (3 responses)

The virtual memory usage as reported by top and ps includes shared libraries, among other things, which is why the numbers do not add up to a necessarily meaningful number. No conspiracy here. ;-)

Accounting memory usage

Posted May 3, 2009 18:38 UTC (Sun) by arjan (subscriber, #36785) [Link] (2 responses)

http://repo.moblin.org/moblin/development/core/source/mem...

memuse is a small simple app that does a reasonably good job of showing the "real" memory cost of an application; it accounts the cost of a shared library as divided by the number of users etc etc....

Accounting memory usage

Posted May 4, 2009 8:13 UTC (Mon) by nhippi (subscriber, #34640) [Link]

After finding out how to rescue this from a .src.rpm and build it, some comments. Per-app memory consumption appears to be effectively:

awk '/Pss:/{sum+=$2}END{print sum "kB"}' /proc/$pid/smaps

This makes one wonder why ps/top don't support showing Pss themselves.

The library section of output appears to be quite short, not showing for example libc or libX11.
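For anyone who would rather not use awk, here is a rough Python equivalent of the one-liner above. The function name and the sample text are made up for illustration. One deliberate difference: it matches only lines that start with `Pss:`, so a `SwapPss:` line, which newer kernels also emit and which the awk pattern `/Pss:/` would match too, is not added to the sum.

```python
def total_pss_kb(smaps_text):
    """Sum every 'Pss:' line of a /proc/<pid>/smaps dump, in kB."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Pss:"):
            total += int(line.split()[1])
    return total


# Abbreviated, made-up excerpt covering two mappings.
sample = """\
Size:                132 kB
Rss:                 120 kB
Pss:                  40 kB
Size:               1024 kB
Rss:                 512 kB
Pss:                 512 kB
"""
pss_kb = total_pss_kb(sample)
```

In real use the text would come from reading `/proc/<pid>/smaps`, which normally requires being the owner of the process or root.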

Accounting memory usage

Posted May 10, 2009 23:26 UTC (Sun) by vonbrand (subscriber, #4458) [Link]

Trouble with that idea is that it depends sensitively on the other stuff that is running at the moment. E.g., having no or 20 processes also using glibc at the moment makes quite a difference "on average".
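A small worked example of that sensitivity, using invented numbers: a process with 4000 kB of private memory and 1600 kB of resident glibc pages. Real proportional accounting is done page by page, so treating all shared pages as having the same number of users is a simplification.

```python
def pss_kb(private_kb, shared_kb, sharers):
    """Proportional set size: private pages plus an equal share of shared ones."""
    return private_kb + shared_kb / sharers


# Hypothetical process: 4000 kB private, 1600 kB of resident glibc pages.
alone = pss_kb(4000, 1600, sharers=1)     # nothing else is using glibc
crowded = pss_kb(4000, 1600, sharers=20)  # twenty processes share it
```

The same process is charged 5600 kB when it runs alone and 4080 kB when twenty processes share the library, although nothing about the process itself has changed.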

Memory leaks in managed languages

Posted May 4, 2009 4:19 UTC (Mon) by jordanb (guest, #45668) [Link]

Emacs quit being a bloated program the day Eclipse was released.

That may point to some sort of corollary to Wirth's Law: if your software is considered bloated, all you have to do is wait a bit and someone will come along and blow you away by an order of magnitude.

Memory leaks in managed languages

Posted May 3, 2009 17:59 UTC (Sun) by drag (guest, #31333) [Link] (3 responses)

I am seeing the same thing that you are with Tomboy.

About 5-10 megs per note. The memory usage doesn't go away when you close the notes out. If you restart it then it's fine.

The only thing that I can think of, if it's not a bug (and it does seem very much like a memory leak), is that we both have excessive amounts of RAM and the garbage collection for Mono isn't getting triggered to reclaim those objects. But I don't know Mono or C# well enough to really know what is going on.

It does seem very horrible.

Memory leaks are quite possible with languages like Python or C#...

At least in Python you are not allowed to destroy objects in memory directly. It's possible, but it would require wizardry. So what happens is that when you create an object and bind it to a variable name, it will stay in existence as long as it has a single reference. Usually people create variables in loops and functions so that the scope of the variable is short-lived and the object loses its reference and is ready to be garbage collected.

However, if you are creating lots of objects in a dynamic manner and you're using them in a global scope or a long-running loop (like the main loop) or whatever, then you need to be careful that when you're finished with them you either void out the reference or 'del' the reference. Otherwise they'll just build up, and that is where you get your leak.
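A minimal sketch of the 'del' case, assuming a hypothetical module-level dict `open_widgets` that lives as long as the program does. A weak reference is used only as a probe to see whether the object still exists; `Widget` and `create` are illustrative names.

```python
import gc
import weakref


class Widget:
    """Hypothetical object created on demand in a long-running program."""


open_widgets = {}  # module-level dict: lives as long as the program does


def create(name):
    open_widgets[name] = Widget()
    return weakref.ref(open_widgets[name])  # probe that keeps nothing alive


probe = create("note-1")
alive_while_referenced = probe() is not None  # True: the dict still holds it

del open_widgets["note-1"]  # drop the last strong reference when finished
gc.collect()                # not needed on CPython, harmless elsewhere

alive_after_del = probe() is not None  # False: nothing references it now
```

Without the `del` line the widget would stay alive for the whole run, which is the build-up described above.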

Also, for good performance, you generally try to avoid invoking the garbage collector as much as possible. So you try to reuse objects or reduce the number of objects you create. If you're always creating objects all over the place then memory usage will go up, and in order to counter that the garbage collection system will be invoked rather often. In long-running programs garbage collection can take up a significant amount of processor time if you're not careful.

That sort of thing.

But it's deceiving because normally people don't think like that in Python or whatnot. That sort of stuff is usually taken care of automatically, so naive programmers will get burned by it.

High-level languages are certainly no substitute for skill or knowledge... I think that understanding things like C and memory management is going to be required for the most effective programming in C# or Python or whatever. To get the best out of something like Python you really do need to understand what is happening under the skin... I don't think that Mono would be any different.

Memory leaks in managed languages

Posted May 3, 2009 19:33 UTC (Sun) by sbergman27 (guest, #10767) [Link] (2 responses)

For comparison, Gnome's standard sticky notes applet starts at 14M of rss with 10M shared, for 4MB net consumption. After creating 10 notes, that increases to 17M rss and 11M shared, which represents 6MB net consumption. And even for the first note, the response is so close to instant that I could not time it even with a stopwatch. My eyes cannot even detect any delay. It does seem, currently, to lack search capability, which I can't imagine being a very expensive or difficult thing to implement. And it seems to me that would bring it to feature completeness for the vast majority of people.

Meanwhile, Tomboy has either a bug or a "feature" which causes it to suck up 32M for doing nothing. And another 7M per note for doing the one trivially simple thing that a sticky note program is supposed to do.

Furthermore, if it is a "feature", i.e. if it is just avoiding garbage collection for performance purposes, as you suggest... well, wasn't one of the central points of this article that Tomboy was annoyingly slow? And the whole idea of "performance optimization" for a trivial sticky note app is absolutely ludicrous in the first place.

If it's a memory leak, then after however many years Tomboy has been around, there is an unbelievably huge and gaping memory leak in the code path for the one very simple thing that a sticky note app is supposed to do: create sticky notes.

Tomboy may or may not be representative of Mono's power. But it sure as hell isn't looking like a very good demonstration of Mono's amazing coolness.

Memory leaks in managed languages

Posted May 3, 2009 19:56 UTC (Sun) by drag (guest, #31333) [Link] (1 responses)

Well, a note-taking application is one of those things that is deceptively simple. Think of things like the ability to share or synchronize notes, automatic search indexing, automatic keyword association, etc.

To make it very useful it needs to look and behave simply as far as the UI is concerned, but it's quite a complex task.

But yeah, things are not looking good for Mono.

Memory leaks in managed languages

Posted May 4, 2009 12:25 UTC (Mon) by johill (subscriber, #25196) [Link]

On the topic of things not looking good for Mono, I can't believe nobody has mentioned this long-standing bug yet: https://bugzilla.novell.com/show_bug.cgi?id=379602

Although it is now fixed, the fix will take a very long time to trickle down to distros, and until then the bug makes Mono suck a lot of power.

Memory leaks in managed languages

Posted May 7, 2009 10:05 UTC (Thu) by epa (subscriber, #39769) [Link] (7 responses)

So you're saying that for each note containing the word 'testing', it uses seven cents of memory? (A quick check of newegg.com shows that RAM currently costs about ten dollars per gigabyte.)
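The arithmetic behind that figure, using only the two numbers quoted in this thread (about 7MB per note and ten dollars per gigabyte); the constant names are made up.

```python
DOLLARS_PER_GB = 10.0  # price quoted in the comment above
MB_PER_NOTE = 7        # per-note RSS growth measured earlier in the thread

dollars_per_mb = DOLLARS_PER_GB / 1024
cents_per_note = MB_PER_NOTE * dollars_per_mb * 100  # a little under 7 cents
```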

Memory leaks in managed languages

Posted May 7, 2009 20:47 UTC (Thu) by jmorris42 (guest, #2203) [Link] (6 responses)

> So you're saying that for each note containing the word 'testing',
> it uses seven cents of memory?

And it is people like you that got us into the situation where a five year old PC can't run a current Linux distro well. So much for being a viable option for old, obsolete low spec hardware, and there goes the 3rd world.

And it is people like you that have probably, in the end, cost us the netbook and smartphone battle. You see, in some use cases it isn't a question of the cost of more RAM, it is the power consumption. Keeping gigs of RAM alive in a smartphone isn't currently possible and it kills battery life like crazy in a netbook.

Time for an unpleasant truth. Put (your favorite Linux distro) and XP on a netbook. Which one boots faster? Which one logs in faster? Which one posts the start menu faster (with icons)? Odds are XP will win one or more of those races. We got fat, dumb and stupid, that's what happened. It wasn't always like this; I remember when Linux was lean and fast. To the people who wanted to make Linux 'just like Windows', all I have to say is: congratulations, you succeeded.

Now imagine that we would have had the sense years ago to look at a clock applet chewing through more RAM than you needed to run a complete UNIX environment in and said, "That is insane!" I can promise you Linksys wouldn't have had to toss Linux off their WRT line for being too fat. We would totally OWN the netbook and smartphone space.

Memory leaks in managed languages

Posted May 7, 2009 22:54 UTC (Thu) by nix (subscriber, #2304) [Link] (1 responses)

I considered the 'seven cents of memory' to be a *bad* thing. One cent per
letter? A hundred words costs ten dollars?!

(And that's non-ECCRAM costs. If you care about your data, with modern RAM
volumes, you really should get ECCRAM, even if it is slower.)

Memory leaks in managed languages

Posted May 8, 2009 17:14 UTC (Fri) by epa (subscriber, #39769) [Link]

> A hundred words costs ten dollars?!

Obviously not - I doubt the memory usage for a hundred-word note would be significantly more than a note saying 'hello'. It would still use about seven cents' worth of memory, or perhaps ten cents.

Memory leaks in managed languages

Posted May 7, 2009 23:08 UTC (Thu) by dlang (guest, #313) [Link]

to be fair, it's not linux that has gotten fat, it's the desktop environment.

it's still possible to run linux on low-spec systems, but you aren't going to run Gnome or KDE (or openoffice or firefox) and be at all happy.

if you have a server-type application that doesn't need this, or are willing to do your own GUI stuff (a media box for example), linux can still do the job just fine.

but if you try to install Red Hat, Suse or Ubuntu on the system and start from there you have two strikes against you when you start (Debian isn't as bad right now, but it's also not nearly as good as it used to be)

it's not just that packages are larger than they used to be, it's that many more things have been declared 'essential' that didn't used to be.

Memory leaks in managed languages

Posted May 8, 2009 17:27 UTC (Fri) by epa (subscriber, #39769) [Link] (2 responses)

Even a netbook these days comes with 512 megs of RAM as standard, and happily runs big applications such as Firefox and Openoffice (they are both installed as standard on my year-old Acer). I appreciate the sentiment you express that everything has gotten slower and bloatier, but really, this is a standard old-timer's complaint that was just as valid in 1999, 1989 or even 1979 as it is now. If there is a need to make the system faster and less power-hungry, then as with optimizing any software, it's best done by profiling to find the high-priority fixes, not by microoptimizing based on guesses or moral outrage about a particular app's memory usage.

> Now imagine that we would have had the sense years ago to look at a clock applet chewing through more RAM than you needed to run a complete UNIX environment in and said, "That is insane!"

This was indeed my reaction many years ago when I first saw that xclock used over three megabytes of memory. Yet somehow the trusty old xclock is now considered lean and mean by today's standards.

Memory leaks in managed languages

Posted May 11, 2009 9:36 UTC (Mon) by etienne_lorrain@yahoo.fr (guest, #38022) [Link] (1 responses)

The problem with the argument "it doesn't matter how much RAM we use, it is cheap" is that it doesn't take into account the memory cache.
Cache memory is expensive - that is why processor prices vary so much.
In simple words, if a piece of code is completely in the first level cache (code and data), you can consider it done; if it is in the second level cache, it will be done really soon; if it is elsewhere, you will need to wait.
That is independent of micro-optimisations like the number of assembly instructions in a function, but it means you have to take care of the "locality" principle, which is the design principle behind memory caches.
Basically, to write good code, you should not page-in (into the cache) more than what you need, by correctly organising your data (all you will need will be on the same cache lines) and by correctly inlining functions.
Also, to be a "good citizen", you should not completely trash the memory cache just to display the time...

Memory leaks in managed languages

Posted May 21, 2009 17:08 UTC (Thu) by epa (subscriber, #39769) [Link]

You are right, an app that uses more memory will run slower. But there is no indication that Tomboy runs too slowly on any reasonable hardware (apart from startup time, which is a noticeable three or four seconds). After all its speed is limited by user interaction. Again, it comes down to practical considerations, and not some moral issue.

Memory leaks in managed languages

Posted May 3, 2009 16:58 UTC (Sun) by alankila (guest, #47141) [Link]

To be on the pedantic side, the collector should also be a compacting collector, because otherwise empty space tends to accumulate between long-lived objects. This in turn tends to slowly but surely increase the application's memory usage over time.
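A toy model of that effect, not a description of how Mono's collector actually works: the heap is a hypothetical 16-slot array holding alternating long-lived and short-lived objects, and all names are made up.

```python
def largest_free_run(heap):
    """Length of the longest run of consecutive free (None) slots."""
    best = run = 0
    for slot in heap:
        run = run + 1 if slot is None else 0
        best = max(best, run)
    return best


# Toy heap of 16 one-slot objects, alternating long- and short-lived.
heap = ["long" if i % 2 == 0 else "short" for i in range(16)]

# The short-lived objects die; a non-compacting collector frees them in place.
heap = [slot if slot == "long" else None for slot in heap]
free_slots = heap.count(None)        # half the heap is free in total...
run_before = largest_free_run(heap)  # ...but no two free slots are adjacent

# A compacting collector slides the survivors together.
survivors = [slot for slot in heap if slot is not None]
compacted = survivors + [None] * (len(heap) - len(survivors))
run_after = largest_free_run(compacted)
```

Before compaction 8 of the 16 slots are free, yet the largest contiguous hole is a single slot, so any request for two adjacent slots forces the heap to grow. After compaction the same 8 free slots form one usable run.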

Mono has a compacting collector, last time I looked, but it still wasn't used by default due to some problems that seem esoteric enough to not warrant going into here...

