-flto enables link time optimisation, which is IMHO something slightly
different, and the earlier (current?) implementation has a reputation of
being slow. I was thinking more about -combine -fwhole-program. As far as
I know, link time optimisation is about doing further optimisations before
the real link.
The -fmem-report flag is indeed interesting. On my current project, which
is about 4K lines of C code, it reports 16MB allocated with combine +
whole-program, 12MB when using dietlibc, and max 3MB when compiling
files one at a time.
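For anyone who wants to repeat that kind of measurement, the invocations
being compared look roughly like this (file names invented, -O2 just as
an example):

  # everything through the compiler at once, old-style whole-program mode
  gcc -O2 -combine -fwhole-program -fmem-report foo.c bar.c baz.c -o prog

  # one file at a time
  gcc -O2 -fmem-report -c foo.c

  # -flto instead keeps separate compilation and defers the
  # cross-module work to the link step
  gcc -O2 -flto -c foo.c bar.c baz.c
  gcc -O2 -flto foo.o bar.o baz.o -o prog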
So assuming C++ is ten times worse, and the code ten times bigger, then
you're indeed easily using gigabytes of memory. I guess you don't want
to compile big C++ programs at once though; doing it per file should be
fine.
> That's exactly the class of allocations for which obstacks are good and
> GC can often be forgone. When you have long-lived allocations in a
> complex webwork in elaborate interconnected graphs, then is when GC
> becomes essential. And GCC has elaborate interconnected graphs up the
> wazoo. The quality of internal APIs is mostly irrelevant here: it's the
> nature of the data structures that matters.
With crappy APIs/design you can't allocate objects on the stack, but are
forced to allocate them dynamically even if they're short-lived.
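To make that concrete, here is a rough sketch of the obstack pattern the
quoted paragraph is talking about, using glibc's <obstack.h> (the node
type and the loop are made up for illustration):

#include <obstack.h>
#include <stdlib.h>

/* glibc's obstack macros expect these two definitions */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

struct node { int value; struct node *next; };

void build_and_throw_away(void)
{
    struct obstack ob;
    struct node *head = NULL;
    int i;

    obstack_init(&ob);

    /* Each allocation is essentially a pointer bump in the current
       chunk, with no per-object bookkeeping. */
    for (i = 0; i < 1000; i++) {
        struct node *n = obstack_alloc(&ob, sizeof *n);
        n->value = i;
        n->next = head;
        head = n;
    }

    /* ... use the nodes ... */

    /* One call releases the whole lot at once: fine for short-lived,
       clearly scoped allocations, useless for a long-lived graph whose
       nodes die at unpredictable times. */
    obstack_free(&ob, NULL);
}

This only works because every node's lifetime ends at the same known
point; the interconnected-graph case below is exactly where that
assumption breaks down.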
The problem with elaborate interconnected graphs is that it's hard to end
up with nodes that have no references at all, so GC usually won't help
much. And even if it does, it probably doesn't reduce peak memory usage.
So yes, in such cases you want something like GC, for your own sanity,
but it won't improve total memory usage much unless you limit the graph
size some other way.