
Why kernel.org is slow

Posted Jan 11, 2007 13:19 UTC (Thu) by davecb (subscriber, #1574)
Parent article: Why kernel.org is slow

Directory performance is a long-lived issue with
Unix-derived operating systems, and a known
hard problem even in the research world: Andy
Tanenbaum's Amoeba team has some interesting
publications on the subject.

In a previous life, the low-hanging fruit in
in-memory directory structures were:
- The time to find that a file does not exist.
  Ironically, NTFS does it better, with an ordered
  (actually b-tree) structure, but one can get
  surprising improvements by sorting just the
  in-memory form of the structure.
- Searching for something which does exist: as above.
- Using the full generality of locking for an
  update to a single directory entry. Renaming
  to an equal-length or shorter name is a common
  case which can be done with minimal locking
  (depending on your locking structure: YMMV (:-))
- Reader-writer locks, for some sense of that phrase.
  Getting the right sense seems to be rather subtle, but
  the read speed that kernel.org needs can be directly
  addressed here. And finally,
- Lock-free and low-lock schemes, optimal for the
  combination of reader-writer and fast in-memory
  access, for all of the above.
It is understood that the last is something of a challenge (;-))
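
To make the first two items concrete, here is a minimal sketch of keeping just the in-memory form of a directory sorted, so that a lookup for a name that does not exist costs a binary search rather than a full scan. The class and function names (SortedDirectory, linear_contains) and the file names are made up for illustration; this is not how any particular kernel does it.

```python
import bisect


class SortedDirectory:
    """Hypothetical in-memory directory that keeps entry names sorted."""

    def __init__(self, names=()):
        self._names = sorted(set(names))

    def contains(self, name):
        # Binary search: O(log n) whether or not the name exists.
        i = bisect.bisect_left(self._names, name)
        return i < len(self._names) and self._names[i] == name

    def add(self, name):
        # Insert at the sorted position, skipping duplicates.
        i = bisect.bisect_left(self._names, name)
        if i == len(self._names) or self._names[i] != name:
            self._names.insert(i, name)


def linear_contains(names, name):
    # Unsorted baseline: a miss has to compare against every entry.
    return any(entry == name for entry in names)


entries = [f"linux-2.6.{n}.tar.bz2" for n in range(20)]
directory = SortedDirectory(entries)
print(directory.contains("linux-2.6.19.tar.bz2"))
print(directory.contains("linux-2.7.0.tar.bz2"))
```

The miss is the interesting case: the unsorted scan only learns that a file is absent after touching every entry, while the sorted form stops after about log2(n) comparisons.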

--dave



Why kernel.org is slow

Posted Jan 11, 2007 13:48 UTC (Thu) by etienne_lorrain@yahoo.fr (guest, #38022)

If the problem is linked to read/write access and locks, would it be good (when there is a lot of read and few writes like for the Linux versions), to keep the filesystem mounted read-only most of the time?
I mean, keep the partition containing data read-only, then to update do:
mount -o remount,rw /server/data
cp -ra new_linux_version /server/data
sync
mount -o remount,ro /server/data

Just curious,
Etienne.

Why kernel.org is slow

Posted Jan 11, 2007 16:01 UTC (Thu) by davecb (subscriber, #1574)

Hmmn, does someone know if Linux directory locks are never held
on read-only media? I know ZFS locks at the directory-entry
level (see http://src.opensolaris.org/source/xref/loficc/crypto/usr/... )
but UFSs generally lock the in-memory directory, and I don't know
whether they care if it comes from RO or RW media...
Anyone know ext3 that well?

--dave

Why kernel.org is slow

Posted Jan 13, 2007 13:20 UTC (Sat) by ebiederm (subscriber, #35028)

Linux appears to do a much better job than the NT kernel for
the in-memory data structures. A cheap way to see this is to
run git on a Windows system. There is an order of magnitude
performance hit for directory-sensitive things. I don't believe
that is just cygwin.

ext3 for large directories hashes the filename and looks it up in a
btree. Using a hash of the filename results in a better branching
factor in your btree. So the on-disk data structures are not at
a disadvantage.
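
A rough sketch of the hash-then-lookup idea, for readers who have not seen it. The hash here is CRC-32 purely as a stand-in (an assumption, not the hash ext3 uses), and a sorted list of (hash, name) pairs stands in for the leaves of the btree; HashedIndex and name_hash are made-up names.

```python
import bisect
import zlib


def name_hash(name):
    # Stand-in hash (CRC-32); an assumption, not the hash ext3 uses.
    return zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF


class HashedIndex:
    """Hypothetical directory index keyed by a hash of the filename."""

    def __init__(self, names):
        # Sorted (hash, name) pairs stand in for the leaves of a btree.
        self._entries = sorted((name_hash(n), n) for n in names)
        self._hashes = [h for h, _ in self._entries]

    def lookup(self, name):
        h = name_hash(name)
        i = bisect.bisect_left(self._hashes, h)
        # Walk any entries that share the hash to resolve collisions.
        while i < len(self._hashes) and self._hashes[i] == h:
            if self._entries[i][1] == name:
                return True
            i += 1
        return False


index = HashedIndex(f"patch-2.6.{n}.bz2" for n in range(1000))
print(index.lookup("patch-2.6.512.bz2"))
print(index.lookup("patch-2.4.0.bz2"))
```

One way to read the branching-factor point: the keys being compared are small fixed-size integers rather than variable-length names, so more of them fit in each index block.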

I haven't looked but ext2+ directories should all be kept in the same
block group which is roughly a single disk track. So even with
fragmentation the disk track cache should work well. I don't remember
if block groups are small enough so that they always map to the
same disk track though.

So I'm pretty certain the issue is the large directories and the inode
semaphore.

Read-ahead should help a lot if the pages don't get thrown out before we
use them.

Changing the locking to allow more concurrency is a trickier problem.
If done right my gut feel is that you should be able to operate
essentially lock free, with multiple concurrent writes and reads going on
simultaneously. The readdir semantics allow for it. But anything with a
high degree of concurrency comes with tricky corner cases.
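
As a much weaker illustration than what is described above, here is a copy-on-write sketch in which readers never take a lock: they just read whichever immutable snapshot is current, and a writer builds a new snapshot and swaps it in. Writers are still serialized by a lock here, so this is not the fully lock-free scheme with concurrent writes, and it leans on CPython rebinding an attribute atomically. SnapshotDirectory is a made-up name.

```python
import threading


class SnapshotDirectory:
    """Hypothetical directory with lock-free reads via immutable snapshots."""

    def __init__(self, names=()):
        self._snapshot = frozenset(names)
        self._write_lock = threading.Lock()  # serializes writers only

    def contains(self, name):
        # Readers take no lock: they see the old or the new snapshot,
        # never a half-updated one.
        return name in self._snapshot

    def listing(self):
        # A readdir-like pass over one consistent snapshot.
        return sorted(self._snapshot)

    def add(self, name):
        with self._write_lock:
            # Build a new snapshot, then publish it with one rebinding.
            self._snapshot = self._snapshot | {name}

    def remove(self, name):
        with self._write_lock:
            self._snapshot = self._snapshot - {name}


d = SnapshotDirectory(["README"])
writers = [threading.Thread(target=d.add, args=(f"file{i}",)) for i in range(50)]
for t in writers:
    t.start()
for t in writers:
    t.join()
print(len(d.listing()))
```

A reader that started its listing before an add simply does not see the new entry, which is the kind of slack the readdir semantics leave room for.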

Eric

Why kernel.org is slow

Posted Jan 13, 2007 15:44 UTC (Sat) by davecb (subscriber, #1574)

Excellent!

I'd be inclined to say that lock-free algorithms
might be a solution to look closely at... more
speculation after I've had a chance to think
about it (;-))

--dave

Why kernel.org is slow

Posted Jan 14, 2007 10:25 UTC (Sun) by evgeny (subscriber, #774)

> I don't believe that is just cygwin.

Comparing with [an app running under] cygwin is unfair. Try watching a configure script under cygwin and natively - the difference can easily be a factor of ten.

