|
|
Subscribe / Log in / New account

Disk space (was: SELF: Anatomy of an (alleged) failure)

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 17:03 UTC (Thu) by chad.netzer (subscriber, #4257)
In reply to: Disk space (was: SELF: Anatomy of an (alleged) failure) by cesarb
Parent article: SELF: Anatomy of an (alleged) failure

Which reminds me, why aren't loadable binaries compressed on disk, and uncompressed on the fly? Surely any decompression overhead is lower than a rotating storage disk seek, and common uncompressed binaries would get cached anyways. I suppose it's because it conflicts with mmap() or something.


to post comments

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 24, 2010 19:40 UTC (Thu) by dlang (guest, #313) [Link] (8 responses)

it all depends on your system.

do you have a SSD?

are you memory constrained (decompressing requires that you have more space than the uncompressed image)

do you page out parts of the code and want to read in just that page later (if so, you would have to uncompress the entire binary to find the appropriate page)

what compression algorithm do you use? many binaries don't actually compress that well, and some decompression algorithms (bzip2 for example) are significantly slower than just reading the raw data.

I actually test this fairly frequently in dealing with processing log data. in some condititions having the data compressed and uncompressing it when you access it is a win, in other cases it isn't.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 0:14 UTC (Fri) by chad.netzer (subscriber, #4257) [Link] (7 responses)

Yeah, I made reference to some of the gotchas (spindles, mmap/paging). Actually, it sounds like the kind of thing that, should you care about it, is better handled by a compressed filesystem mounted onto the bin directories, rather than some program loader hackery.

Still, why the heck must my /bin/true executable take 30K on disk? And /bin/false is a separate executable that takes *another* 30K, even though they are both dynamically linked to libc??? Time to move to busybox on the desktop...

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 0:38 UTC (Fri) by dlang (guest, #313) [Link] (4 responses)

re: size of binaries

http://www.muppetlabs.com/~breadbox/software/tiny/teensy....

A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 2:41 UTC (Fri) by chad.netzer (subscriber, #4257) [Link] (3 responses)

Yes, I'm familiar with that old bit of cleverness. :) Note that the GNU coreutils stripped /bin/true and /bin/false executables are more than an order of magnitude larger than the *starting* binary that is whittled down in that demonstration. Now, *that* is code bloat.

To be fair getting your executable much smaller than the minimal disk block size is just a fun exercise. Whereas coreutils /bin/true may actually benefit from an extent based filesystem. :) Anyway, it's just a silly complaint I'm making, though it has always annoyed me a tiny bit.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 12:25 UTC (Fri) by dark (guest, #8483) [Link] (2 responses)

Yes, but GNU true does so much more! It supports --version, which tells you all about who wrote it and about the GPL and the FSF. It also supports --help, which explains true's command-line options (--version and --help). Then there is the i18n support, so that people from all over the world can learn about --help and --version. You just don't get all that with a minimalist ELF binary.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 25, 2010 15:38 UTC (Fri) by intgr (subscriber, #39733) [Link] (1 responses)

Indeed, I use those features every day! ;)

PS: Shells like zsh actually ship builtin "true" and "false" commands

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 29, 2010 23:03 UTC (Tue) by peter-b (guest, #66996) [Link]

So does POSIX sh. The following command is equivalent to true:

:

The following command is equivalent to false:

! :

I regularly use both when writing shell scripts.

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 27, 2010 16:42 UTC (Sun) by nix (subscriber, #2304) [Link]

There are two separate binaries because the GNU Project thinks it is confusing to have single binaries whose behaviour changes depending on what name they are run as, even though this is ancient hoary Unix tradition. Apparently people might go renaming the binaries and then get confused when they don't work the same. Because we do that all the time, y'know.

(I think this rule makes more sense on non-GNU platforms, where it is common to rename *everything* via --program-prefix=g or something similar, to prevent conflicts with the native tools. But why should those of us using the GNU toolchain everywhere be penalized for this?)

Disk space (was: SELF: Anatomy of an (alleged) failure)

Posted Jun 27, 2010 16:46 UTC (Sun) by nix (subscriber, #2304) [Link]

The size, btw, is probably because the gnulib folks have found bugs in printf which the glibc folks refuse to fix (they only cause buffer overruns or bus errors on invalid input, after all, how problematic could that be?) so GNU software that uses gnulib will automatically replace glibc's printf with gnulib's at configure time. (That this happens even for things like /bin/true, which will never print the class of things that triggers the glibc printf bugs, is a flaw, but not a huge one.)

And gnulib, because it has no stable API or ABI, is always statically linked to its users.

26Kb for a printf implementation isn't bad.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds