
Temporary files: RAM or disk?

Posted Jun 25, 2012 9:40 UTC (Mon) by Serge (guest, #84957)
In reply to: Temporary files: RAM or disk? by nix
Parent article: Temporary files: RAM or disk?

> I thought the idea of per-user /tmp was that every user got his own /tmp, sure, but this was implemented via subdirectories of the *real*, tmpfs, cleared-on-boot /tmp.

You don't need tmpfs for that, then: it works with /tmp anywhere (disk, RAM, a separate partition, NFS, etc.). I mean, this is neither a reason to use tmpfs nor a reason to avoid it.

> One application that becomes a lot faster with /tmp on tmpfs is GCC without -pipe, or, even with -pipe, at the LTO link step.

Faster linking? Let's check that with something that has a lot of binaries:
mount tmpfs or ext3 on /mnt/test, then
$ cd /mnt/test
$ wget
$ export CFLAGS='-O2 -g -flto' TMPDIR=/mnt/test
$ time sh -c "tar xf coreutils-8.17.tar.xz; cd coreutils-8.17; ./configure; make install DESTDIR=/mnt/test/root; cd ../root; tar czf ../coreutils-package.tar.gz *; cd ..; rm -rf coreutils-8.17 root"

tmpfs results:
real 882.876s user 760.111s sys 110.353s
real 884.456s user 761.408s sys 110.603s
real 885.245s user 762.770s sys 110.525s
real 884.914s user 762.417s sys 110.395s
real 885.352s user 762.865s sys 110.360s

ext3 results:
real 895.244s user 762.620s sys 115.027s
real 893.134s user 762.447s sys 114.841s
real 898.353s user 763.645s sys 116.369s
real 898.010s user 763.472s sys 116.074s
real 897.525s user 763.671s sys 116.219s

If my test is correct, it's still the same 1-2%. It is faster, but not by a lot.
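
For what it's worth, the speedup those runs imply can be computed mechanically; a small awk sketch over the wall-clock times above:

```shell
# Average the "real" times of the five tmpfs and five ext3 runs above
# and report the tmpfs speedup as a percentage.
avg() { awk '{ s += $1; n++ } END { printf "%.3f\n", s / n }'; }
tmpfs=$(printf '%s\n' 882.876 884.456 885.245 884.914 885.352 | avg)
ext3=$(printf '%s\n' 895.244 893.134 898.353 898.010 897.525 | avg)
awk -v a="$tmpfs" -v b="$ext3" \
    'BEGIN { printf "tmpfs %ss, ext3 %ss, speedup %.1f%%\n", a, b, 100 * (b - a) / b }'
```

which puts tmpfs about 1.3% ahead here.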


Temporary files: RAM or disk?

Posted Jun 26, 2012 15:49 UTC (Tue) by nix (subscriber, #2304)

[lots of crude benchmarking ahead.]

It's not just linking that a tmpfs /tmp speeds up a bit, in theory: it's compilation too, because without -pipe GCC writes its intermediate .s file to TMPDIR. (And -pipe is not the default: it obviously speeds up compilation by allowing extra parallelism as well as reducing potential disk I/O, so I don't quite understand *why* it's still not the default, but there you are.)

btw, coreutils is by nobody's standards 'something having a lot of binaries'. It has relatively few very small binaries, few object files, and an enormous configure script that takes about 95% of the configure/make time (some of which, it is true, runs the compiler and writes to TMPDIR, but most of which is more shell-dependent than anything). LTO time will also have minimal impact in this build.

But, you're right, I'm pontificating in the absence of data -- or data less than eight years old, anyway, as the last time I measured this was in 2004. That's so out of date as to be useless. Time to measure again. But let's use some more hefty test cases than coreutils, less dominated by weird marginal workloads like configure runs.

Let's try a full build of something with more object files, and investigate elapsed time, cpu+sys time, and (for non-tmpfs) disk I/O time as measured from /proc/diskstats (thus, possibly thrown off by cross-fs merging: this is unavoidable, alas). A famous old test, the kernel (hacked to not use -pipe, with hot cache), shows minimal speedup, since the kernel does a multipass link process and writes the intermediates to non-$TMPDIR anyway:

tmpfs TMPDIR, with -pipe (baseline): 813.75user 51.28system 2:13.32elapsed
tmpfs TMPDIR: 812.23user 50.62system 2:12.96elapsed
ext4 TMPDIR: 809.74user 51.90system 2:29.15elapsed 577%CPU; TMPDIR reads: 11, 88 sectors; writes: 6394, 1616928 sectors; 19840ms doing TMPDIR I/O.
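
The /proc/diskstats arithmetic is simple enough to script; a rough sketch (field 13 is milliseconds spent doing I/O; the device name and the $DISKSTATS override are illustrative assumptions, the latter just there to make the function testable):

```shell
# Read the per-device "ms spent doing I/O" counter before and after a
# command, and report the difference as I/O time attributable to it.
io_ms() {
    # $1 = block device name, e.g. sda
    awk -v d="$1" '$3 == d { print $13 }' "${DISKSTATS:-/proc/diskstats}"
}

measure_io() {
    dev=$1; shift
    before=$(io_ms "$dev")
    "$@"          # the build command whose I/O we want to attribute
    sync          # flush dirty pages so delayed writeback is counted
    after=$(io_ms "$dev")
    echo "$((after - before))"
}
```

e.g. `measure_io sda make -j9`, subject to the cross-fs merging caveat above.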

So, a definite effect, but not a huge one. I note that the effect of -pipe is near-nil these days, likely because the extra parallelism you get from combining the compiler and assembler is just supplanting the extra parallelism you would otherwise get by running multiple copies of the compiler in parallel via make -j. (On a memory-constrained or disk-constrained system, where the useless /tmp writes may contend with useful disk reads, and where reads may be required as well, we would probably see a larger effect, but this system has 24GB of RAM and a caching RAID controller atop disks capable of 250MB/s in streaming writes, so it is effectively unconstrained, being quite capable of holding the whole source tree and all build products in RAM simultaneously. So this is intentionally a worst case for my thesis. Smaller systems will see a larger effect. Most systems these days are not I/O- or RAM-constrained when building a kernel, anyway.)

How about a real 900kg monster of a test, GCC? This one has everything, massive binaries, massive numbers of object files, big configure scripts writing to TMPDIR run in parallel with ongoing builds, immense link steps, you name it: if there is an effect this will show it. (4.6.x since that's what I have here right now: full x86_64/x86 multilibbed biarch nonprofiled -flto=jobserver -j 9 bootstrap including non-multilib libjava, minus testsuite run: hot cache forced by cp -a'ing the source tree before building; LTO is done in stage3 but in no prior stages so as to make the comparison with the next test a tiny bit more meaningful: stage2/3 comparison is suppressed for the same reason):

tmpfs TMPDIR: 13443.91user 455.17system 36:02.86elapsed 642%CPU
ext4 TMPDIR: 13322.24user 514.38system 36:01.62elapsed 640%CPU; TMPDIR reads: 59, 472 sectors; writes: 98661, 20058344 sectors; 83690ms doing TMPDIR I/O

So, no significant effect elapsed-time-wise, well into the random noise: though the system time is noticeably higher for the non-tmpfs case, it is hugely dominated by the actual compilation. However, if you were doing anything else with the system you would have noticed: paging was intense, as you'd expect with around 10GB of useless writes being flushed to disk. Any single physical disk would have been saturated, and a machine with much less memory would have been waiting on it.

That's probably the most meaningful pair of results here, a practical worst case for the CPU overhead of non-tmpfs use. Note that the LTO link stage alone writes around six gigabytes to TMPDIR, with peak usage at any one time around 4GB, and most of this cannot be -pipe'd (thus this is actually an example of something that on many machines cannot be tmpfsed effectively).
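
(For the curious, that peak-usage figure can be approximated by polling du(1) while the link runs; a rough sketch, with the obvious caveat that files living for less than a sampling interval can be missed:)

```shell
# Sample the size of a directory once a second while a command runs,
# and report the peak observed usage in KiB -- e.g. watching TMPDIR
# during an LTO link.
peak_dir_usage() {
    dir=$1; shift
    "$@" &
    pid=$!
    peak=0
    while kill -0 "$pid" 2>/dev/null; do
        cur=$(du -sk "$dir" 2>/dev/null | awk '{ print $1 }')
        [ "${cur:-0}" -gt "$peak" ] && peak=$cur
        sleep 1
    done
    wait "$pid"
    echo "$peak"
}
```

e.g. `peak_dir_usage "$TMPDIR" make -j9`.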

Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds