Memory leaks in managed languages
Posted May 3, 2009 0:02 UTC (Sun) by JoeBuck (subscriber, #2330)In reply to: Great news by sbergman27
Parent article: Tomboy, Gnote, and the limits of forks
Languages that use garbage collection are free from one source of memory leaks: they can collect an object that is no longer reachable by any pointer or reference. However, a programmer who uses this feature as an excuse to get sloppy often winds up with ever-growing memory anyway, because objects that are no longer needed are still linked into a massive, confusing data structure. The program then grows and grows, failing to free unneeded storage, just like a leaky C++ program.
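A minimal Python sketch of that failure mode, for illustration only. The `NoteManager` class and its never-trimmed `history` list are hypothetical names invented here; they are not taken from Tomboy or Gnote. The point is just that a "deleted" object stays alive as long as some forgotten structure still references it.

```python
import gc
import weakref


class Note:
    """A trivial note object; stands in for any application object."""

    def __init__(self, text):
        self.text = text


class NoteManager:
    """Hypothetical manager that forgets to unlink deleted notes."""

    def __init__(self):
        self.notes = {}     # notes the user can currently see
        self.history = []   # bookkeeping list that is never trimmed
        self._next_id = 0

    def create(self, text):
        note = Note(text)
        note_id = self._next_id
        self._next_id += 1
        self.notes[note_id] = note
        self.history.append(note)   # second reference, easy to forget
        return note_id

    def delete(self, note_id):
        # Removes the visible entry only; the history list still holds
        # a reference, so the collector must keep the object alive.
        del self.notes[note_id]


manager = NoteManager()
live = weakref.WeakSet()    # tracks notes without keeping them alive

for _ in range(1000):
    note_id = manager.create("testing")
    live.add(manager.notes[note_id])
    manager.delete(note_id)

gc.collect()
leaked = len(live)          # every "deleted" note is still reachable

manager.history.clear()     # drop the forgotten references
gc.collect()
remaining = len(live)       # now nothing keeps the notes alive

print(leaked, remaining)
```

Under these assumptions all 1000 notes survive their own deletion until the stray list is cleared, which is the managed-language equivalent of a leak.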
On the other hand, I verified with valgrind that gnote is a leaky C++ program. The developer might want to check that; just create and delete a couple of notes, and there are leaks.
      Posted May 3, 2009 1:28 UTC (Sun)
                               by sbergman27 (guest, #10767)
                              [Link] (21 responses)
       
Well, maybe this is a leak, and maybe it's just horrific memory usage. But I just installed Tomboy. Started it. Cleared the "start here" default note, and restarted Tomboy. And added 10 notes, one by one. Each one simply said "testing". 
Shared memory started at 19M and remained constant throughout the test. 
RSS was as follows: 
Note# RSS 
Incredulous, I added another 10. At the end of the 20th note, RSS was up to 168M! Tomboy starts out using about as much memory as Firefox does at startup, and then sucks up 7MB for each instance of storing the word "testing" in a note. I find this incredible. Far worse than I had thought previously. 168M to store 140 bytes. Maybe it's the Unicode... 
Could you please run that test with Gnote and post the results?  I'm curious how it compares. 
 
     
    
      Posted May 3, 2009 4:07 UTC (Sun)
                               by JoeBuck (subscriber, #2330)
                              [Link] (8 responses)
       
The leaks I mentioned don't seem to be very large.
 
By contrast, an instance of Emacs visiting ten files is only 18m.  Yet people still claim that Emacs is a bloated program.
      
           
     
    
      Posted May 3, 2009 4:41 UTC (Sun)
                               by sbergman27 (guest, #10767)
                              [Link] (6 responses)
       
Virtual memory size? That's completely useless. What is the RSS size? (Strictly speaking you should turn off swap to take that measurement, but if you have plenty of memory it shouldn't make much, if any, difference.) 
Just for reference, the virtual memory size for Tomboy starts out at 343M and increases to 439M with 10 notes. 
 
     
    
      Posted May 3, 2009 6:10 UTC (Sun)
                               by JoeBuck (subscriber, #2330)
                              [Link] (5 responses)
       
     
    
      Posted May 3, 2009 12:55 UTC (Sun)
                               by sbergman27 (guest, #10767)
                              [Link] (4 responses)
       
To get the "real" unique memory usage for the process, you can (with the swap turned off) subtract the shared column from the rss column. The smem utility, featured a few days ago here on LWN, can show other interesting things, like proportional set size. 
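The subtraction described above can be sketched in Python against `/proc/<pid>/statm`, whose fields are counted in pages (size, resident, shared, text, lib, data, dt). The function names and the sample line are made up for illustration; the sample is not a measurement of Tomboy or Gnote, and a 4 kB page size is assumed unless the live helper is used.

```python
import os


def memory_breakdown_kb(statm_line, page_size=4096):
    """Return (rss_kb, shared_kb, unique_kb) from a /proc/<pid>/statm line.

    statm fields are counted in pages: size resident shared text lib data dt.
    """
    fields = statm_line.split()
    resident = int(fields[1])
    shared = int(fields[2])
    to_kb = page_size // 1024
    return resident * to_kb, shared * to_kb, (resident - shared) * to_kb


def memory_breakdown_for_pid(pid):
    """Read the live figures for a process (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return memory_breakdown_kb(f.read(), os.sysconf("SC_PAGE_SIZE"))


# Sample line, made up for illustration (not measured from any real program).
sample = "87808 13568 4864 12 0 9000 0"
rss_kb, shared_kb, private_kb = memory_breakdown_kb(sample)
print(rss_kb, shared_kb, private_kb)
```

As the comment notes, the result is only meaningful when nothing has been swapped out, since swapped pages drop out of the resident count.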
Sometimes I think that the authors of "top" had a special conference to determine how to present memory information in the most misleading way possible, and then wrote "top" based upon the results. Many people do think that the virtual size actually means something. And most new users seem to think that Linux is a sort of monster black hole of memory leakage, since no matter what they are doing, it sucks up all available memory until there is almost no free memory left. Htop is, at least, better. 
     
    
      Posted May 3, 2009 17:12 UTC (Sun)
                               by hppnq (guest, #14462)
                              [Link] (3 responses)
       
     
    
      Posted May 3, 2009 18:38 UTC (Sun)
                               by arjan (subscriber, #36785)
                              [Link] (2 responses)
       
memuse is a small, simple app that does a reasonably good job of showing the "real" memory cost of an application; it accounts for the cost of a shared library by dividing it by the number of users, and so on. 
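The "divide by the number of users" accounting can be illustrated with a few lines of Python. This is not memuse's code, only a sketch of the arithmetic; the function name and all the numbers are invented for the example.

```python
def proportional_kb(private_kb, shared_mappings):
    """Private memory plus each shared mapping divided by its user count.

    shared_mappings is a list of (size_kb, number_of_processes) pairs.
    """
    total = float(private_kb)
    for size_kb, users in shared_mappings:
        total += size_kb / users
    return total


# Made-up numbers for illustration only.
libs_busy = [(1200, 4), (600, 3)]    # libraries shared with other processes
libs_alone = [(1200, 1), (600, 1)]   # same libraries, no other users

busy = proportional_kb(2000, libs_busy)      # 2000 + 300 + 200
alone = proportional_kb(2000, libs_alone)    # 2000 + 1200 + 600
print(busy, alone)
```

The same process is charged a different amount depending on how many other processes map the same libraries, so the figure describes the whole system at that moment rather than the application alone.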
 
     
    
      Posted May 4, 2009 8:13 UTC (Mon)
                               by nhippi (subscriber, #34640)
                              [Link] 
       
awk '/^Pss:/{sum+=$2}END{print sum "kB"}' /proc/$pid/smaps 
This makes one wonder why ps/top don't support showing Pss themselves. 
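For anyone who would rather script this than type the awk one-liner, here is a rough Python equivalent. The function names and the smaps excerpt are invented for illustration. It matches only lines that begin with `Pss:`, so fields on newer kernels whose names merely contain that string (such as `SwapPss:`) are not added in.

```python
def total_pss_kb(smaps_text):
    """Sum the Pss: lines of a /proc/<pid>/smaps dump, in kB."""
    total = 0
    for line in smaps_text.splitlines():
        # Match only lines that start with "Pss:" so that other fields
        # whose names merely contain it (e.g. "SwapPss:") are skipped.
        if line.startswith("Pss:"):
            total += int(line.split()[1])
    return total


def total_pss_for_pid(pid):
    """Read and sum Pss for a live process (Linux only)."""
    with open(f"/proc/{pid}/smaps") as f:
        return total_pss_kb(f.read())


# Abbreviated, made-up smaps excerpt for illustration.
sample = """\
Size:                132 kB
Rss:                  96 kB
Pss:                  24 kB
SwapPss:               8 kB
Size:               1024 kB
Rss:                 512 kB
Pss:                 512 kB
"""
print(total_pss_kb(sample), "kB")
```

Reading another user's `smaps` normally needs matching permissions, so `total_pss_for_pid` is best tried on your own processes first.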
The library section of output appears to be quite short, not showing for example libc or libX11.  
     
      Posted May 10, 2009 23:26 UTC (Sun)
                               by vonbrand (subscriber, #4458)
                              [Link] 
       
Trouble with that idea is that it depends sensitively on the other stuff that is running at the moment. E.g., having no or 20 processes also using  
     
      Posted May 4, 2009 4:19 UTC (Mon)
                               by jordanb (guest, #45668)
                              [Link] 
       
That may pose some sort of corollary to Wirth's Law: if your software is considered bloated, all you have to do is wait a bit and someone will come along and blow you away by an order of magnitude. 
     
      Posted May 3, 2009 17:59 UTC (Sun)
                               by drag (guest, #31333)
                              [Link] (3 responses)
       
About 5-10 megs per note. The memory usage doesn't go away when you close the notes out. If you restart it then it's fine. 
If it's not a bug (and it does seem very much like a memory leak), the only thing that I can think of is that we both have excessive amounts of RAM and the garbage collection for Mono isn't getting triggered to reclaim those objects.  But I don't know Mono or C# well enough to really know what is going on.  
It does seem very horrible. 
 
Memory leaks are quite possible with languages like Python or C#... 
At least in Python you are not allowed to destroy objects in memory directly. It's possible, but it would require wizardry.  So what happens is that when you create an object and bind it to a variable name it will stay in existence as long as it has a single reference. Usually people create variables in loops and functions so that the scope of the variable is short-lived and the object loses its reference and is ready to be garbage collected.  
However, if you are creating lots of objects in a dynamic manner and you're using them in a global scope or long-running loop (like the main loop) or whatever, then you need to be careful that when you're finished with them you either void out the reference or 'del' the reference.  Otherwise they'll just build up and that is where you get your leak.  
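A small Python sketch of that point, with made-up names (`open_notes`, `probe`). It shows that `del` only removes one name: the object goes away once the last reference is dropped, which here means clearing the long-lived list as well.

```python
import gc
import weakref


class Note:
    def __init__(self, text):
        self.text = text


open_notes = []                 # long-lived, module-level container

note = Note("testing")
open_notes.append(note)         # second reference to the same object
probe = weakref.ref(note)       # lets us observe the object's lifetime

del note                        # removes one name, not the object
gc.collect()
alive_after_del = probe() is not None      # still held by open_notes

open_notes.clear()              # drop the last remaining reference
gc.collect()
alive_after_clear = probe() is not None    # now it can be reclaimed

print(alive_after_del, alive_after_clear)
```

In a long-running main loop, the container plays the role of `open_notes`: anything appended to it and never removed stays in memory for the life of the program.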
Also, for good performance, you generally try to avoid invoking the garbage collector as much as possible. So you try to reuse objects or reduce the number of objects you create. If you're always creating objects all over the place then memory usage will go up and in order to counter that the garbage collection system will be invoked rather often. In long-running programs garbage collecting can take up a significant amount of processor time if you're not careful. 
That sort of thing.  
But it's deceiving because normally people don't think like that in Python or whatnot. That sort of stuff is usually taken care of automatically so naive programmers will get burned by it. 
High level languages are certainly no substitute for skill or knowledge... I think that understanding things like C and memory management is going to be required for the most effective programming in C# or Python or whatever. To get the best out of something like Python you really do need to understand what is happening under the skin... I don't think that Mono would be any different. 
 
     
    
      Posted May 3, 2009 19:33 UTC (Sun)
                               by sbergman27 (guest, #10767)
                              [Link] (2 responses)
       
 Meanwhile, Tomboy has either a bug or a "feature" which causes it to suck up 32M for doing nothing. And another 7M per note for doing the one trivially simple thing that a sticky note program is supposed to do. 
Furthermore, if it is a "feature", i.e. if it is just avoiding garbage collection for performance purposes, as you suggest... well, wasn't one of the central points of this article that Tomboy was annoyingly slow? And the whole idea of "performance optimization" for a trivial sticky note app is absolutely ludicrous in the first place. 
If it's a memory leak, then after however many years Tomboy has been around, there is an unbelievably huge and gaping memory leak in the code path for the one very simple thing that a sticky note app is supposed to do: create sticky notes. 
Tomboy may or may not be representative of Mono's power. But it sure as hell isn't looking like a very good demonstration of Mono's amazing coolness. 
 
     
    
      Posted May 3, 2009 19:56 UTC (Sun)
                               by drag (guest, #31333)
                              [Link] (1 responses)
       
To make it very useful it needs to look and behave simply as far as the UI is concerned, but it's quite a complex task.  
But ya things are not looking good for the Mono. 
     
    
      Posted May 4, 2009 12:25 UTC (Mon)
                               by johill (subscriber, #25196)
                              [Link] 
       
Although now fixed, it will take a very long time to trickle down to distros and makes mono suck a lot of power. 
     
      Posted May 7, 2009 10:05 UTC (Thu)
                               by epa (subscriber, #39769)
                              [Link] (7 responses)
       
     
    
      Posted May 7, 2009 20:47 UTC (Thu)
                               by jmorris42 (guest, #2203)
                              [Link] (6 responses)
       
And it is people like you that got us into the situation where a five year old PC can't run a current Linux distro well.  So much for being a viable option for old, obsolete low spec hardware, and there goes the 3rd world. 
And it is people like you that have probably, in the end, cost us the netbook and smartphone battle.  You see, in some use cases it isn't a question of the cost of more RAM, it is the power consumption.  Keeping gigs of RAM alive in a smartphone isn't currently possible and it kills battery life like crazy in a netbook. 
Time for an unpleasant truth.  Put (your favorite Linux distro) and XP on a netbook.  Which one boots faster?  Which one logs in faster?  Which one posts the start menu faster (with icons)?  Odds are XP will win one or more of those races.  We got fat, dumb and stupid, that's what happened.  It wasn't always like this, I remember when Linux was lean and fast.  To the people who wanted to make Linux 'just like Windows' all I have to say is Congratulations, you succeeded. 
Now imagine that we would have had the sense years ago to look at a clock applet chewing through more RAM than you needed to run a complete UNIX environment in and say, "That is insane!"  I can promise you Linksys wouldn't have had to toss Linux off their WRT line for being too fat.  We would totally OWN the netbook and smartphone space. 
     
    
      Posted May 7, 2009 22:54 UTC (Thu)
                               by nix (subscriber, #2304)
                              [Link] (1 responses)
       
(And that's non-ECCRAM costs. If you care about your data, with modern RAM  
 
     
    
      Posted May 8, 2009 17:14 UTC (Fri)
                               by epa (subscriber, #39769)
                              [Link] 
       
     
      Posted May 7, 2009 23:08 UTC (Thu)
                               by dlang (guest, #313)
                              [Link] 
       
it's still possible to run linux on low-spec systems, but you aren't going to run Gnome or KDE (or openoffice or firefox) and be at all happy. 
if you have a server-type application that doesn't need this, or are willing to do your own GUI stuff (a media box for example), linux can still do the job just fine. 
but if you try to install Red Hat, Suse or Ubuntu on the system and start from there you have  two strikes against you when you start (Debian isn't as bad right now, but it's also not nearly as good as it used to be) 
it's not just that packages are larger than they used to be, it's that many more things have been declared 'essential' that didn't used to be. 
     
      Posted May 8, 2009 17:27 UTC (Fri)
                               by epa (subscriber, #39769)
                              [Link] (2 responses)
       
     
    
      Posted May 11, 2009 9:36 UTC (Mon)
                               by etienne_lorrain@yahoo.fr (guest, #38022)
                              [Link] (1 responses)
       
     
    
      Posted May 21, 2009 17:08 UTC (Thu)
                               by epa (subscriber, #39769)
                              [Link] 
       
     
      Posted May 3, 2009 16:58 UTC (Sun)
                               by alankila (guest, #47141)
                              [Link] 
       
Mono has a compacting collector, last time I looked, but it still wasn't used by default due to some problems that seem esoteric enough to not warrant going into here... 
     
> On the other hand, I verified with valgrind that gnote is a leaky C++ program.

Note#  RSS
0      32M
1      43M
2      50M
3      57M
4      63M
5      69M
6      75M
7      82M
8      88M
9      94M
10     101M
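The per-note cost quoted in the thread can be checked against these figures with a few lines of Python. The numbers are copied from the table above and from the 168M reported after the 20th note; only the variable names are new.

```python
# RSS in MB after creating 0..10 notes, copied from the table above.
rss_mb = [32, 43, 50, 57, 63, 69, 75, 82, 88, 94, 101]

# Growth caused by each individual note.
deltas = [after - before for before, after in zip(rss_mb, rss_mb[1:])]
average = sum(deltas) / len(deltas)

# The thread also reports 168M after the 20th note.
average_next_ten = (168 - rss_mb[-1]) / 10

print(deltas)
print(round(average, 1), round(average_next_ten, 1))
```

Both averages come out a little under 7 MB per note, which matches the "7MB for each instance" estimate, with a larger jump for the very first note.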
      It looks quite similar; as reported by top, the virtual memory size for gnote rises to 110m with ten notes displayed.  My guess is that the memory is mostly used by the gtk+ objects, which are probably just the same for the two programs.
      
> the virtual memory size for gnote rises to 110m with ten notes displayed
      I get 53M for RSS with ten notes displayed.
      
      
      The virtual memory usage as reported by top and ps includes shared libraries, among other things, which is why the numbers do not add up to a necessarily meaningful number. No conspiracy here. ;-)
      
      Accounting memory usage
      
      glibc at the moment makes quite a difference "on average".
      
      
> it uses seven cents of memory?
      
letter? A hundred words costs ten dollars?!
volumes, you really should get ECCRAM, even if it is slower.)
> A hundred words costs ten dollars?!
Obviously not - I doubt the memory usage for a hundred-word note would be significantly more than a note saying 'hello'.  It would still use about seven cents' worth of memory, or perhaps ten cents.
      
      
Even a netbook these days comes with 512 megs of RAM as standard, and happily runs big applications such as Firefox and OpenOffice (they are both installed as standard on my year-old Acer).  I appreciate the sentiment you express that everything has gotten slower and bloatier, but really, this is a standard old-timer's complaint that was just as valid in 1999, 1989 or even 1979 as it is now.  If there is a need to make the system faster and less power-hungry, then as with optimizing any software, it's best done by profiling to find the high-priority fixes, not by micro-optimizing based on guesses or moral outrage about a particular app's memory usage.
> Now imagine that we would have had the sense years ago to look at a clock applet chewing through more RAM than you needed to run a complete UNIX environment in and say, "That is insane!"
This was indeed my reaction many years ago when I first saw that xclock used over three megabytes of memory.  Yet somehow the trusty old xclock is now considered lean and mean by today's standards.
      
      
Cache memory is expensive - that is why processor prices vary so much.
In simple words, if a piece of code is completely in the first level cache (code and data), you can consider it done; if it is in the second level cache, it will be done really soon; if it is elsewhere, you will need to wait.
That is independent of micro-optimisations like the number of assembly instructions in a function, but it means you have to take care of the "locality" principle, which is the design principle behind memory caches.
Basically, to write good code, you should not page in (into the cache) more than what you need, by correctly organising your data (all you will need will be on the same cache lines) and by correctly inlining functions.
Also, to be a "good citizen", you should not completely trash the memory cache just to display the time...