
Temporary files: RAM or disk?

Posted Jun 16, 2012 4:30 UTC (Sat) by Serge (guest, #84957)
In reply to: Temporary files: RAM or disk? by TRauMa
Parent article: Temporary files: RAM or disk?

> I thought the plan was to migrate to per-user-tmp anyway, somewhere in $HOME, for apps that use a lot of tmp like DVD rippers this would be a good idea anyway.

A per-user directory would not get cleaned on reboot. A per-user temporary directory may also be a bad thing for users with NFS /home; they would rather use a local tmp in that case. A common /tmp is still needed for file exchange between users on multiuser servers. And finally, why would DVD software use something in $HOME when it can use /tmp, which is there exactly for such things? ;)

Why put /tmp on tmpfs, though? Having /var/tmp/portage on tmpfs does not force you to put /tmp there. And it's really hard to find an application that becomes faster just because /tmp is on tmpfs; even for portage it's not that obvious.
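(For what it's worth, keeping only the portage build directory on tmpfs is a one-line fstab entry, something like the line below; the size= value is an assumption, pick one that fits your RAM:

tmpfs /var/tmp/portage tmpfs size=4G,noatime 0 0

/tmp itself stays wherever it already is.)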

> Compiles on tmpfs are faster, factor is 1.8 to 2 in my tests

Hm... My simple test shows that tmpfs is only about 1-2% faster.
Here's a script that resembles a basic package build:
mount tmpfs or ext3 to /mnt/test, then
$ cd /mnt/test
$ wget http://curl.haxx.se/download/curl-7.26.0.tar.bz2
$ export CFLAGS='-O2 -g -pipe' CXXFLAGS='-O2 -g -pipe'
$ time sh -c 'tar xf curl-7.26.0.tar.bz2 && cd curl-7.26.0 && ./configure && make install DESTDIR=/mnt/test/root && cd ../root && tar czf ../curl-package.tar.gz * && cd .. && rm -rf curl-7.26.0 root'
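(Concretely, "mount tmpfs or ext3" means something like the following, where /dev/sdXN is a placeholder for whatever ext3 partition you test on:

$ mount -t tmpfs tmpfs /mnt/test
or
$ mount -t ext3 /dev/sdXN /mnt/test
)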

tmpfs results:
real 70.983s user 48.685s sys 26.527s
real 70.635s user 48.390s sys 26.694s
real 70.701s user 48.203s sys 26.929s
real 70.867s user 48.636s sys 27.090s
real 70.744s user 48.297s sys 27.082s

ext3 results:
real 71.690s user 48.401s sys 27.498s
real 71.614s user 48.340s sys 27.869s
real 71.531s user 48.836s sys 27.520s
real 71.479s user 48.306s sys 27.469s
real 71.635s user 48.540s sys 27.496s

What have I missed?



Temporary files: RAM or disk?

Posted Jun 16, 2012 13:44 UTC (Sat) by nix (subscriber, #2304) [Link]

I thought the idea of per-user /tmp was that every user got his own /tmp, sure, but this was implemented via subdirectories of the *real*, tmpfs, cleared-on-boot /tmp, e.g. /tmp/user-$name/... This can all be done fairly easily with pam_namespace: there's even an example in the default /etc/security/namespace.conf.
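(A minimal sketch of that setup, close to the commented-out example shipped in namespace.conf; the exempt-user list is an assumption:

# polydir   instance_prefix   method   users NOT polyinstantiated
/tmp        /tmp-inst/        user     root,adm

together with a "session required pam_namespace.so" line in the relevant /etc/pam.d service file, each user then gets a private instance directory mounted over /tmp at login.)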

(One application that becomes a lot faster with /tmp on tmpfs is GCC without -pipe, or, even with -pipe, at the LTO link step. It writes quite a lot of large, extremely temporary intermediate output to files in /tmp at each stage of the processing pipeline, then reads it back again in the next stage.)
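(Easy enough to see for yourself; hello.c is any trivial source file and the exact temporary names vary:

$ strace -f -e trace=file gcc -O2 -c hello.c 2>&1 | grep '/tmp/cc'

shows the intermediate file being written and read back; run the same command with -pipe and the /tmp traffic disappears.)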

Temporary files: RAM or disk?

Posted Jun 25, 2012 9:40 UTC (Mon) by Serge (guest, #84957) [Link]

> I thought the idea of per-user /tmp was that every user got his own /tmp, sure, but this was implemented via subdirectories of the *real*, tmpfs, cleared-on-boot /tmp.

You don't need tmpfs for that, then. It will work with /tmp anywhere (disk, RAM, separate partition, NFS, etc.). I mean, this is neither a reason to use tmpfs nor a reason to avoid it.

> One application that becomes a lot faster with /tmp on tmpfs is GCC without -pipe, or, even with -pipe, at the LTO link step.

Faster linking? Let's check that with something having a lot of binaries:
mount tmpfs or ext3 to /mnt/test, then
$ cd /mnt/test
$ wget http://ftp.gnu.org/gnu/coreutils/coreutils-8.17.tar.xz
$ export CFLAGS='-O2 -g -flto' TMPDIR=/mnt/test
$ time sh -c "tar xf coreutils-8.17.tar.xz; cd coreutils-8.17; ./configure; make install DESTDIR=/mnt/test/root; cd ../root; tar czf ../coreutils-package.tar.gz *; cd ..; rm -rf coreutils-8.17 root"

tmpfs results:
real 882.876s user 760.111s sys 110.353s
real 884.456s user 761.408s sys 110.603s
real 885.245s user 762.770s sys 110.525s
real 884.914s user 762.417s sys 110.395s
real 885.352s user 762.865s sys 110.360s

ext3 results:
real 895.244s user 762.620s sys 115.027s
real 893.134s user 762.447s sys 114.841s
real 898.353s user 763.645s sys 116.369s
real 898.010s user 763.472s sys 116.074s
real 897.525s user 763.671s sys 116.219s

If my test is correct, it's still the same 1-2%. It is faster, but not by a lot.

Temporary files: RAM or disk?

Posted Jun 26, 2012 15:49 UTC (Tue) by nix (subscriber, #2304) [Link]

[lots of crude benchmarking ahead.]

It's not just linking that a tmpfs /tmp speeds up a bit, in theory: it's compilation, because without -pipe GCC writes its intermediate .S file to TMPDIR (and -pipe is not the default: obviously it speeds up compilation by allowing extra parallelism as well as reducing potential disk I/O, so I don't quite understand *why* it's still not the default, but there you are.)
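(The driver output shows the difference directly; hello.c is any trivial source file, and the exact paths vary by GCC version:

$ gcc -v -c hello.c 2>&1 | grep '/tmp/cc'

matches the /tmp/cc*.s file that cc1 writes and as reads back; add -pipe and the two command lines are joined by a pipe instead, with nothing written to /tmp.)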

btw, coreutils is by nobody's standards 'something having a lot of binaries'. It has relatively few, very small binaries, few object files, and an enormous configure script that takes about 95% of the configure/make time (some of which, it is true, runs the compiler and writes to TMPDIR, but most of which is more shell-dependent than anything else). LTO time will also have minimal impact on this build.

But, you're right, I'm pontificating in the absence of data -- or data less than eight years old, anyway, as the last time I measured this was in 2004. That's so out of date as to be useless. Time to measure again. But let's use heftier test cases than coreutils, ones less dominated by weird marginal workloads like configure runs.

Let's try a full build of something with more object files, and investigate elapsed time, user+sys time, and (for non-tmpfs) disk I/O time as measured from /proc/diskstats (thus possibly thrown off by cross-fs merging: this is unavoidable, alas). A famous old test, the kernel build (hacked not to use -pipe, with a hot cache), shows minimal speedup, since the kernel does a multipass link and writes its intermediates outside $TMPDIR anyway:

tmpfs TMPDIR, with -pipe (baseline): 813.75user 51.28system 2:13.32elapsed
tmpfs TMPDIR: 812.23user 50.62system 2:12.96elapsed
ext4 TMPDIR: 809.74user 51.90system 2:29.15elapsed 577%CPU; TMPDIR reads: 11, 88 sectors; writes: 6394, 1616928 sectors; 19840ms doing TMPDIR I/O.
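(The TMPDIR read/write/I/O-time figures come from snapshotting /proc/diskstats before and after the build and subtracting; a sketch, with field positions per Documentation/iostats.txt and sdXN standing in for the device backing TMPDIR:

$ awk '$3 == "sdXN" { print "reads:", $4, "(" $6 " sectors); writes:", $8, "(" $10 " sectors);", $13 "ms doing I/O" }' /proc/diskstats
)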

So, a definite effect, but not a huge one. I note that the effect of -pipe is near-nil these days, likely because the extra parallelism you get from combining the compiler and assembler is just supplanting the extra parallelism you would otherwise get by running multiple copies of the compiler in parallel via make -j. (On a memory-constrained or disk-constrained system, where the useless /tmp writes may contend with useful disk reads, and where reads may be required as well, we would probably see a larger effect; but this system has 24GB of RAM and a caching RAID controller atop disks capable of 250MB/s in streaming writes, so it is effectively unconstrained, being quite capable of holding the whole source tree and all build products in RAM simultaneously. So this is intentionally a worst case for my thesis. Smaller systems will see a larger effect. Most systems these days are not I/O- or RAM-constrained when building a kernel, anyway.)

How about a real 900kg monster of a test, GCC? This one has everything, massive binaries, massive numbers of object files, big configure scripts writing to TMPDIR run in parallel with ongoing builds, immense link steps, you name it: if there is an effect this will show it. (4.6.x since that's what I have here right now: full x86_64/x86 multilibbed biarch nonprofiled -flto=jobserver -j 9 bootstrap including non-multilib libjava, minus testsuite run: hot cache forced by cp -a'ing the source tree before building; LTO is done in stage3 but in no prior stages so as to make the comparison with the next test a tiny bit more meaningful: stage2/3 comparison is suppressed for the same reason):

tmpfs TMPDIR: 13443.91user 455.17system 36:02.86elapsed 642%CPU
ext4 TMPDIR: 13322.24user 514.38system 36:01.62elapsed 640%CPU; TMPDIR reads: 59, 472 sectors; writes: 98661, 20058344 sectors; 83690ms doing TMPDIR I/O

So, no significant effect elapsed-time-wise, well into the random noise: though the system time is noticeably higher for the non-tmpfs case, it is hugely dominated by the actual compilation. However, if you were doing anything else with the system you would have noticed: writeback was intense, as you'd expect with around 10GB of useless writes being flushed to disk. Any single physical disk would have been saturated, and a machine with much less memory would have been waiting on it.

That's probably the most meaningful pair of results here: a practical worst case for the CPU overhead of non-tmpfs use. Note that the LTO link stage alone writes around six gigabytes to TMPDIR, with peak usage at any one time around 4GB, and most of this cannot be -pipe'd (thus this is actually an example of something that on many machines cannot be tmpfsed effectively).
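(If you want to watch that peak for yourself during the LTO link, crude polling does the job; TMPDIR is wherever you pointed it:

$ while sleep 1; do du -sh "$TMPDIR"; done
)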

