
4K, why not also 64K?

Posted Mar 12, 2009 15:30 UTC (Thu) by BenHutchings (subscriber, #37955)
In reply to: 4K, why not also 64K? by zmi
Parent article: Linux and 4K disk sectors

The x86 basic page size is fixed at 4K, and paging to disks with a larger block size is problematic. However, the ATA-8 standard does allow for larger block sizes, and Linux will probably adapt to this at some point.
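
A minimal sketch of how userspace can ask the kernel what logical sector size it reports for a block device, using the BLKSSZGET ioctl; the device path /dev/sda is only an example and error handling is kept short:

/* Query the logical sector size the kernel reports for a block device. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>       /* BLKSSZGET */

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY);   /* example device */
    if (fd < 0) { perror("open"); return 1; }

    int lss = 0;
    if (ioctl(fd, BLKSSZGET, &lss) == 0)
        printf("logical sector size: %d bytes\n", lss);
    else
        perror("BLKSSZGET");

    close(fd);
    return 0;
}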



4K, why not also 64K?

Posted Mar 12, 2009 15:40 UTC (Thu) by clugstj (subscriber, #4020) [Link]

So, x86 brain damage will constrain us for at least another generation?

4K, why not also 64K?

Posted Mar 12, 2009 17:49 UTC (Thu) by james (subscriber, #1325) [Link]

4K page sizes are not necessarily brain damage. They're a tradeoff: with 64K pages, the same-sized TLB covers sixteen times as much memory, but you lose a lot of memory if you're dealing with many small data structures that have to be page-aligned -- mmapped files, for example.

Linus Torvalds has a classic rant on the subject at realworldtech.com (the rant starts a page into his post):

> I've actually done the math. Even 64kB pages is totally useless for a lot of file system access stuff: you need to do memory management granularity on a smaller basic size, because otherwise you just waste all your memory on unused left-over-space.

and

> So reasonable page sizes range from 4kB to 16kB (and quite frankly, 16kB is pushing it - exactly because it has fragmentation issues that blow up memory usage by a huge amount on some loads). Anything bigger than that is no longer useful for general-purpose file access through mmap, for example.

and

> For a particular project I care about, if I were to use a cache granularity of 4kB, I get about 20% lossage due to memory fragmentation as compared to using a 1kB allocation size, but hey, that's still ok. For 8kB blocks, it's another 21% memory fragmentation cost on top of the 4kB case. For 16kB, about half the memory is wasted on fragmentation. For 64kB block sizes, the project that takes 280MB to cache in 4kB blocks, now takes 1.4GB!
James.
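
Linus's numbers are plain internal-fragmentation arithmetic: every cached object gets rounded up to the allocation granularity, so the tail waste per object grows with the page size. A small sketch with made-up object sizes (nothing like his actual workload) shows the effect:

/* For a hypothetical set of cached object sizes, compute how much space
 * is lost to per-object tail waste as the allocation granularity grows
 * from 1K to 64K.  The sizes below are invented purely for illustration. */
#include <stdio.h>
#include <stddef.h>

static size_t round_up(size_t n, size_t gran)
{
    return (n + gran - 1) / gran * gran;
}

int main(void)
{
    const size_t sizes[] = { 300, 2000, 5000, 12000, 70000, 200000 };
    const size_t grans[] = { 1024, 4096, 16384, 65536 };

    for (size_t g = 0; g < sizeof(grans) / sizeof(grans[0]); g++) {
        size_t used = 0, alloc = 0;
        for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
            used  += sizes[i];
            alloc += round_up(sizes[i], grans[g]);
        }
        printf("%6zu-byte granularity: %zu bytes allocated for %zu bytes "
               "of data (%.0f%% overhead)\n",
               grans[g], alloc, used, 100.0 * (alloc - used) / used);
    }
    return 0;
}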

Linus Torvalds on realworldtech

Posted Mar 12, 2009 19:02 UTC (Thu) by anton (subscriber, #25547) [Link]

> Linus Torvalds has a classic rant on the subject at realworldtech.com

Is the Linus Torvalds on realworldtech.com the same person as the well-known author of Linux?

Linus Torvalds on realworldtech

Posted Mar 12, 2009 21:12 UTC (Thu) by james (subscriber, #1325) [Link]

It seems highly likely, yes.

He's got the same use of English (it sounds like Linus), the same technical knowledge, the same opinions on various processors (especially Itanium), the same enjoyment of a good flame war, posts using the torvalds-at-linux-foundation.org email address (although the site doesn't necessarily validate those), and some of these posts have been the subject of news reports on mainstream IT websites (so there's a fair chance that Linus would hear about those posts).

Andi Kleen posts there too (or, again, someone calling himself Andi Kleen), and some of the other posters are also very knowledgeable. So there's a good chance that any slips would be noticed.

If it is an imposter, he's managed to keep a lot of people fooled for a long time over a lot of arguments.

So we can't be as certain as we can that, for example, Alan Cox isn't really a whole load of little Welsh gnomes hiding down disused Welsh coal-mines, but it's pretty likely.

Linus Torvalds on realworldtech

Posted Mar 13, 2009 0:35 UTC (Fri) by njs (guest, #40338) [Link]

>So we can't be as certain as we can that, for example, Alan Cox isn't really a whole load of little Welsh gnomes hiding down disused Welsh coal-mines

...and how do we know that?

Linus Torvalds on realworldtech

Posted Mar 13, 2009 23:23 UTC (Fri) by jd (guest, #26381) [Link]

My information is that they're Cornish Pixies hiding down Welsh Coal Mines. Freshly picked Cornish Pixies.

4K, why not also 64K?

Posted Mar 12, 2009 19:11 UTC (Thu) by zmi (guest, #4829) [Link]

> I've actually done the math. Even 64kB pages is totally useless for
> a lot of file system access stuff: you need to do memory management
> granularity on a smaller basic size, because otherwise you just
> waste all your memory on unused left-over-space.

Depends on the design of the FS. Example: ReiserFS already combined the tail ends of several files into a single 4K disk block and thus saved a *lot* of disk space. And for a company using the server to store its documents, there won't be many files under 64KiB anymore (yes, I know a 65KiB file still loses 63KiB), but those should be fast. Speed is starting to be the limitation, while disk space is not (hey, just put another terabyte disk into the RAID).

Regarding memory page size: I don't understand why that limits the FS block size; there's scatter/gather I/O, and a 64KiB block from disk doesn't need to be contiguous in memory. I'm not a coder, but I think that limitation should be resolvable.

zmi
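
A minimal sketch of the scatter/gather idea, assuming a POSIX system with preadv(2): a single 64KiB read from disk can be scattered into sixteen separate 4KiB buffers, so the on-disk block size does not force a contiguous 64KiB region in memory. The file name is only an example.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/uio.h>

#define PAGE  4096
#define NVECS 16          /* 16 * 4KiB = 64KiB */

int main(void)
{
    int fd = open("datafile", O_RDONLY);    /* example file */
    if (fd < 0) { perror("open"); return 1; }

    /* sixteen page-sized buffers, not contiguous with one another */
    struct iovec iov[NVECS];
    for (int i = 0; i < NVECS; i++) {
        iov[i].iov_base = malloc(PAGE);
        iov[i].iov_len  = PAGE;
    }

    /* read one 64KiB "block" starting at offset 0, scattered across
     * the sixteen buffers */
    ssize_t n = preadv(fd, iov, NVECS, 0);
    printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}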

