
Real-life optimization work

Posted Nov 2, 2005 22:54 UTC (Wed) by cdmiller (subscriber, #2813)
In reply to: Real-life optimization work by jwb
Parent article: All hail the speed demons (O'Reillynet)

Point well taken. You don't see me running Wordperfect 3 from the floppy on my Kaypro lunchbox these days, and the 386 is a motherboard in a box somewhere in the basement.

While I dislike what I perceive as the "bloat" in today's software, I'm certainly not confident I could do better. No insult(s) intended to any developers or products. It is just my observation of what old software running on limited resources looks like compared to today's stuff on modern, readily available hardware. My Afterstep says it has 20 Meg resident; my first Linux and X computer had 8 Meg of RAM and ran fvwm.

Anyhow, kudos to the folks taking on the profiling and optimization tasks.



Real-life optimization work

Posted Nov 3, 2005 0:06 UTC (Thu) by smoogen (subscriber, #97)

Having seen cross-platform projects, I have found that a lot of the "bloat" people see comes from the hardware, the requirements, and the time available to produce it. In cases where a simple C program was compiled on an 8-bit computer, you would see it grow 4x-16x on a 16-bit computer and 64x-128x on a 32-bit one. This wasn't including anything like TrueType fonts and UTF. We had a 1 mb browser that, when we wanted UTF-8, became a 32 mbit monstrosity at first because it had to deal with a whole bunch of rules [left to right, right to left, and up and down rules plus certain checks that are language specific]. We supported 20 languages before, as long as they were western European languages. Doing any of the eastern languages added tons of complexity.

We spent 3 months trying to get it down to a reasonable size (as this was the days of Pentium 60s) but basically ended up shipping the large product because the 'optimizations' kept making it look like crap.

Heck, want to speed up and shrink Linux? Dictate that the world uses ASCII C only, as K&R meant it to.

Real-life optimization work

Posted Nov 3, 2005 10:03 UTC (Thu) by nix (subscriber, #2304)

Indeed. A lot of this is increased alignment constraints, but in binaries (as opposed to in memory) a pile of it is caused directly by increased address sizes, e.g.:
-rwxr-xr-x  1 nix users 1165752 Nov  3 09:55 32/libcrypto.so.0.9.7
-rwxr-xr-x  1 nix users 1398112 Nov  3 09:55 64/libcrypto.so.0.9.7
That's two stripped UltraSPARC binaries, both built with -mcpu=ultrasparc (thus using almost identical instructions), one built with -m32 and one with -m64 with a biarch GCC. Major differences are thus alignment of data (25Kb size difference), code (20Kb size difference)... and relocations (100Kb difference: the 64-bit relocation sections are twice the size, because they're basically big tables of addresses and all the addresses have doubled in size).
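
To make the "twice the size" point concrete, here is a rough sketch in C. It assumes the relocation sections hold ELF "Rela"-style entries (offset, info, addend); the struct and function names are made up for illustration and are not taken from any real header.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for ELF "Rela"-style relocation entries.
 * The three-field layout (offset, info, addend) follows the ELF
 * convention; the type names are hypothetical. */
struct rela32 {
    uint32_t r_offset;  /* where to apply the relocation */
    uint32_t r_info;    /* symbol index and relocation type */
    int32_t  r_addend;  /* constant added to the symbol value */
};

struct rela64 {
    uint64_t r_offset;  /* same fields, each twice as wide */
    uint64_t r_info;
    int64_t  r_addend;
};

/* Bytes needed for a table of n relocation entries of each flavor. */
size_t rela32_table_bytes(size_t n) { return n * sizeof(struct rela32); }
size_t rela64_table_bytes(size_t n) { return n * sizeof(struct rela64); }
```

Every field doubles in width, so a table with the same number of entries doubles in size, whatever the code around it looks like.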

Real-life optimization work

Posted Nov 3, 2005 19:55 UTC (Thu) by mcm (guest, #31917)

I guess the relocations could be compressed, as they can probably be represented as 32-bit offsets to a 64-bit base.

Real-life optimization work

Posted Nov 4, 2005 13:22 UTC (Fri) by nix (subscriber, #2304)

Indeed they could be compressed, but I think you might need a new relocation type for 64+32 base+offset... (I'm not sure and don't have the specs here).
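
The base+offset idea itself can be sketched in a few lines of C. This is only an illustration of the encoding, not an existing relocation type or linker feature; compress_addrs() and expand_addr() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compressed form: one shared 64-bit base plus a 32-bit
 * offset per address.
 *
 * Returns 1 and fills *base and offsets[] if every address lies within
 * 4 GiB of the smallest one. Returns 0 otherwise (including n == 0);
 * in that case *base is untouched and offsets[] may be partly written. */
int compress_addrs(const uint64_t *addrs, size_t n,
                   uint64_t *base, uint32_t *offsets)
{
    if (n == 0)
        return 0;

    /* Use the smallest address as the base so all offsets are >= 0. */
    uint64_t lo = addrs[0];
    for (size_t i = 1; i < n; i++)
        if (addrs[i] < lo)
            lo = addrs[i];

    for (size_t i = 0; i < n; i++) {
        uint64_t delta = addrs[i] - lo;
        if (delta > UINT32_MAX)
            return 0;               /* does not fit in 32 bits */
        offsets[i] = (uint32_t)delta;
    }
    *base = lo;
    return 1;
}

/* Recover the original 64-bit address. */
uint64_t expand_addr(uint64_t base, uint32_t offset)
{
    return base + offset;
}
```

For n addresses this stores 8 + 4n bytes instead of 8n, which approaches half the size for large tables, provided everything really does sit within 4 GiB of the base.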

Real-life optimization work

Posted Nov 5, 2005 3:11 UTC (Sat) by vonbrand (subscriber, #4458)

Sorry, but comparing the size of the binaries is useless. Use size(1) for that. Also, from what I understand, on SPARC 64-bit binaries are much larger due to larger constants (pointers, integers, ...) all over the place.

Besides, what is the point? To get anything running on an 8-bit machine was a challenge; lots of things you take for granted today weren't even the stuff of wet dreams then. You also have to remember that today the expensive part of the mix is people, not machines. Sure, one could develop mean and lean applications doing most of what today's software does. With enough care, you could even figure out how to include just the features people really use, and shave off quite a bit more. But the development would be a whole lot more expensive, just for letting a few MiB of RAM lie around unused for a change.

Real-life optimization work

Posted Nov 7, 2005 0:18 UTC (Mon) by nix (subscriber, #2304)

I size(1)d them, of course; I just didn't want to spray the result all over the comments page.

And, yes, I'd agree that normally shaving bytes off things isn't worth it: however, with that in mind I spent this weekend shaving a few bytes off one data structure in one program and reducing the number of instances of that data structure --- and reducing the program's peak memory consumption from many gigabytes to a few hundred Mb.
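
As a generic illustration of how a few bytes per structure add up (not the actual change described above, whose details aren't given here), reordering fields to avoid alignment padding is one common way to shave a struct. The struct and field names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields in "natural" declaration order: the compiler typically pads
 * after 'kind' and after 'priority' to keep 'timestamp' and the struct
 * size aligned. */
struct event_padded {
    uint8_t  kind;
    uint64_t timestamp;
    uint8_t  priority;
};

/* Same fields, widest first: the two small fields share the space that
 * was previously padding. */
struct event_reordered {
    uint64_t timestamp;
    uint8_t  kind;
    uint8_t  priority;
};

/* Total bytes saved across a given number of live instances. */
size_t bytes_saved(size_t instances)
{
    return instances *
           (sizeof(struct event_padded) - sizeof(struct event_reordered));
}
```

The exact sizes depend on the ABI, so check sizeof on the target rather than assuming; the saving only matters when the instance count is large, which is exactly the case where it turns into gigabytes.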

But microoptimizations without major results, or pervasive ones, are indeed generally not worth it.

Real-life optimization work

Posted Nov 3, 2005 3:17 UTC (Thu) by piman (subscriber, #8957)

The biggest tradeoff is not features but development time. If developers had twice the time to write twice the features, software would probably be faster and leaner. Instead, people want twice the features in half the time. Back in 2000 the chant was "good fonts!", not "good and fast fonts!" and now we're doing the other half of the work.

There is also something to be said for the average quality of programmers in 1980 versus the average quality today. There are more good programmers, but it's more common to have one good and five bad programmers writing your product, than two good programmers. Or sometimes just five bad programmers. On the other hand, we have a lot more software.

Real-life optimization work

Posted Nov 4, 2005 6:47 UTC (Fri) by zblaxell (subscriber, #26385)

There is also a shift in the nature of the programming task.

In 1989-1991 I wrote a personal calendaring application using the best programming tool available to me at the time: 6809 assembler. From scratch. (OK, I had Unix-like system calls, but no library functions, not even math with integers larger than 16 bits).

The application contained many of the usual personal calendar features and some unusual ones: alarm notifications, recurring events, a categorization and prioritization scheme, expiration dates, interactive editing, printable sorted deadline lists, colored text, curses-like interface, etc. The particular combination of features was highly productive for me, and unfortunately a) I've never seen anyone else write a similar application, b) the source code is on an obsolete hard drive, and c) without it, I can't seem to organize my life to get the time to rewrite it.

One thing that happens when you manually type in 1300 assembler instructions is that you don't waste them. There was nothing in that code that didn't need to be there. I entered each instruction by hand, using no assembler macros, only function calls. Features were carefully designed to balance functional benefits against fairly painful coding cost--when 10% of your program is consumed by the functions that manipulate dates and intervals, you think twice before adding superfluous features, and you also find ways to *add* functionality by *removing* code.

This calendaring application binary was about 3K. The smallest i386 binary I can get for the source code "int main(){return 0;}" is more than double that size, but it does less (now *that* is bloat ;-). Oddly enough, at the time I thought 3K was a huge investment in memory since it would be resident in RAM all the time.

If I cloned the old program line by line, but transliterated into C, it'd probably become 10 times larger (recall it became twice as large just by being replaced with a program that returns a constant integer). The i386 requires four bytes for memory addresses instead of two, many of the x86 instructions are longer than the 6809 equivalents, and C compilers don't usually find ways to exploit instructions that are designed for people who are writing date formatting functions by hand in assembler.

If I designed an equivalent program using the tools I'd normally use for binary software development today (C, curses, etc), it'd be 100 times larger. My program contains constant strings for terminal manipulation--this would be replaced with the whole curses/termcap/terminfo/etc infrastructure. If I used malloc() instead of my own memory management library and ANSI C string functions instead of my own string management library, the memory overhead on each event would double. localtime() and mktime() are considerably larger than my date manipulation library--my library didn't have to support time zones, for one thing. A lot of data that was stored in packed bit structures would end up being spread out over bytes, ints, or even text strings in a "modern" design.
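
For anyone who hasn't seen the packed-bits-versus-ints trade-off, here is a small C sketch. The 16-bit layout (7 bits of year, 4 of month, 5 of day) and the 1980 base year are assumptions picked for illustration, not the format the original program used.

```c
#include <stdint.h>

/* The "modern" layout: one int per field, typically 12 bytes. */
struct date_wide {
    int year;
    int month;
    int day;
};

/* Hypothetical packed layout in 16 bits:
 *   bits 15..9  years since BASE_YEAR (0..127)
 *   bits  8..5  month (1..12)
 *   bits  4..0  day   (1..31)
 * Out-of-range inputs are masked, not rejected. */
#define BASE_YEAR 1980

uint16_t pack_date(int year, int month, int day)
{
    return (uint16_t)((((unsigned)(year - BASE_YEAR) & 0x7Fu) << 9) |
                      (((unsigned)month & 0x0Fu) << 5) |
                      ((unsigned)day & 0x1Fu));
}

int packed_year(uint16_t d)  { return BASE_YEAR + (d >> 9); }
int packed_month(uint16_t d) { return (d >> 5) & 0x0F; }
int packed_day(uint16_t d)   { return d & 0x1F; }
```

The packed form is 2 bytes per date against the size of three ints, at the cost of shifts and masks on every access and a hard limit on the year range.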

On the other hand, there is one saving--I won't need several hundred bytes of integer math library since modern CPUs come with these functions *built right into the hardware*. ;-)

If I designed an equivalent program in a scripting language, its source code might be somewhat smaller, but it will probably use more RAM at runtime than was available in the entire machine that used to run the application as a daemon--a bloat factor of over 200 (with a GUI, over 1000). It would also take me a single weekend, not three years, to write it.

But would the program do anything more? No. It would be the same little program, it would just be sitting on top of a mountain of accreted infrastructure.

Real-life optimization work

Posted Nov 6, 2005 0:58 UTC (Sun) by tialaramex (subscriber, #21167)

Are you /sure/ it wouldn't do more?

You see, it's so easy to write a Unicode-enabled, locale-sensitive program that you might easily do so by accident. Your new program might, without you really intending it, support a lot of extra things that a lot of people (maybe even you) would find useful. Things which weren't so much missing from the original as simply never considered. Remember also that the OS support functions are much more powerful and robust than their equivalents on your 6809. Depending on the APIs used your "save file" routine may magically support saving a compressed file, over the network, with automatic versioning...

Real-life optimization work

Posted Nov 8, 2005 8:21 UTC (Tue) by piman (subscriber, #8957)

You forgot to mention bloated things like file permissions and multiple terminals. :)

Also, Unix code (meaning all those things the grandparent eschewed, like malloc and localtime) written in 1989-1991 would take a couple days to port to a modern GNU/Linux distribution. And probably only a few days to port to whatever comes 15 years from now.

So would it do more? Yeah. To start with, it would run in the first place. And without that ability, source code of any size is worthless.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds