
Scalability is a double edged sword...

From:  Duncan Simpson <dps@io.stargate.co.uk>
To:  letters@lwn.net
Subject:  Scalability is a double edged sword...
Date:  Thu, 11 Jul 2002 11:26:30 +0100

 
A Unix kernel with very fine-grained locking exists already. It is called
Solaris, and nobody is impressed with its performance on small systems. If you
have a 64+ processor Ultra Enterprise 10000, or whatever its current
equivalent is, the fine-grained locking is a big win. If you only have 3 or
fewer processors, the locking costs more than the time saved by reduced lock
contention. At least one MPI implementation is also guilty of sacrificing
performance on small systems, like the systems that many people have access to,
at the altar of scalability.
 
I think it would be a mistake for Linux to follow the policy of sacrificing
performance on small systems just for scalability to vast numbers of processors,
which practically nobody using Linux has. Scalability improvements that also
help small systems should be pursued instead, for example the O(1) scheduler.
If the lack of scalability to vast numbers of processors is an issue for you
then you can presumably afford to buy Solaris, UNICOS MAX, or whatever.
 
The spinlock deadlock issue is probably best resolved by a simple, well
maintained list of spinlocks in a fixed (increasing or decreasing) order.
Deadlock is, provably, avoided if you always take locks in the same order
everywhere, which should be moderately easy given such a list. I, perhaps
fortunately, am not in a position to construct or maintain such a list.
Victims^H^H^H^H^H^H^Hvolunteers who are in a position to do so should probably
file their application on the Linux kernel mailing list.
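
The ordering rule above can be sketched in a few lines. This is only an illustration in Python threads, not kernel code: `RankedLock`, `acquire_in_order`, `release_all`, and the lock names `inode` and `dcache` are hypothetical, invented for the example. The point is that two threads asking for the same pair of locks in opposite orders still acquire them in one fixed order, so neither can end up holding one lock while waiting on the other.

```python
import threading


class RankedLock:
    """A lock with a fixed position in one global ordering (hypothetical helper)."""

    def __init__(self, rank, name):
        self.rank = rank
        self.name = name
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()

    def release(self):
        self._lock.release()


def acquire_in_order(*locks):
    """Take the requested locks in increasing rank, whatever order the caller gave."""
    ordered = sorted(locks, key=lambda lock: lock.rank)
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release in the reverse of the acquisition order."""
    for lock in reversed(ordered):
        lock.release()


# The "well maintained list": one place that fixes the order (names are made up).
inode_lock = RankedLock(1, "inode")
dcache_lock = RankedLock(2, "dcache")


def worker(first, second, counter, rounds):
    for _ in range(rounds):
        held = acquire_in_order(first, second)
        try:
            counter[0] += 1  # both locks are held here
        finally:
            release_all(held)


def run_demo(rounds=1000):
    """Two threads name the locks in opposite orders; both finish."""
    counter = [0]
    t1 = threading.Thread(target=worker, args=(inode_lock, dcache_lock, counter, rounds))
    t2 = threading.Thread(target=worker, args=(dcache_lock, inode_lock, counter, rounds))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return counter[0]
```

If `worker` called `first.acquire()` and then `second.acquire()` directly, the two threads could deadlock, each holding the lock the other wants. Sorting by rank is one way to enforce the list mechanically; a kernel would more likely rely on documentation and review.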
 
P.S. I do have some (paper) claims to knowledge of parallel systems. Just when
it was freshly minted parallel systems went out of fashion :-(
 
--
Duncan (-:
"software industry, the: unique industry where selling substandard goods is
legal and you can charge extra for fixing the problems."
 



Scalability is a double edged sword...

Posted Jul 18, 2002 17:50 UTC (Thu) by iabervon (subscriber, #722)

Most Linux systems are still single-processor, and the whole locking setup handles the UP case nicely, regardless of which lock you use.

When fine-grained locking doesn't mean holding more locks, but rather holding a different lock from what different code might hold (e.g., separate locks for different filesystems, instead of one for all), it's better for small SMP, because it removes cache line contention (as well as a bit of lock contention).
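
A rough sketch of that distinction, in Python for brevity: each filesystem object carries a lock, and the only difference between the coarse and fine variants is whether all of them share one lock object. `Filesystem`, `make_filesystems`, and `dirty_blocks` are hypothetical names for the example. Python's interpreter lock hides the cache-line effects described above, so this shows only the structure, not the performance.

```python
import threading


class Filesystem:
    """Per-filesystem state guarded by whichever lock it was given (hypothetical)."""

    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.dirty_blocks = 0

    def mark_dirty(self):
        # Exactly one lock is held in either variant; only *which* lock differs.
        with self.lock:
            self.dirty_blocks += 1


def make_filesystems(names, fine_grained):
    """Build filesystems that share one lock (coarse) or own one each (fine)."""
    shared = threading.Lock()
    return {
        name: Filesystem(name, threading.Lock() if fine_grained else shared)
        for name in names
    }


coarse = make_filesystems(["ext2", "nfs"], fine_grained=False)
fine = make_filesystems(["ext2", "nfs"], fine_grained=True)
```

With `fine`, work on `ext2` never waits for work on `nfs`, and no code path takes more locks than before. With `coarse`, every caller goes through the same lock.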

The BKL should be replaced with other locks whenever possible, because its semantics are odd. Of course, for just that reason, few if any people know how to replace it with other locks in any particular case.

Holding locks for less time is a mixed blessing. If you do two operations while holding the lock, it might not be worth dropping the lock in between them (even if it is safe to do so). But if you're taking a lock, calling a function, and releasing the lock, the locking might as well be done in the function, since that means that the function can take the lock only if it actually needs it, can take a more specific lock, and doesn't have to be documented as requiring the lock. Due to the way SMP support came to Linux (the BKL dropped in at the top and split from there), this situation exists in a lot of places.
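
The "lock, call, unlock" case might look like the following sketch. `Journal`, `append_batch`, and the `lock_acquisitions` counter are hypothetical, and the counter exists only so the example can show when the lock was taken. Once the function owns its locking, it can skip the lock when there is nothing to do, use its own narrower lock, and drop the "caller must hold the lock" rule from its documentation.

```python
import threading


def append_batch_caller_locked(records, batch):
    """Old style: documented as 'caller must hold the journal lock'."""
    records.extend(batch)
    return len(records)


class Journal:
    """New style: the function takes its own, more specific lock (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = []
        self.lock_acquisitions = 0  # instrumentation for the example only

    def append_batch(self, batch):
        # `batch` belongs to the caller, so it can be inspected without the lock.
        if not batch:
            return 0  # nothing to do, so the lock is never touched
        with self._lock:
            self.lock_acquisitions += 1
            self._records.extend(batch)
            return len(self._records)


journal = Journal()
journal.append_batch([])          # returns without locking
journal.append_batch(["a", "b"])  # locks once
```

The old-style caller would wrap every call in the big lock, including the empty-batch calls that never needed it.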

Splitting a lock that is used for a number of related purposes into a group of locks where more than one is needed for some common operation is bad for the low end, and should be avoided in mainline Linux. But there's a lot of other lock refinements that are possible.

Linus, in any case, is focused these days on small SMP, and nothing is going to go into the mainline kernel which hurts small SMP without a major increase in the readability of the code (i.e., it would have to decrease the chance of non-obvious locking-related bugs). People who have large numbers of processors will probably want to maintain a lots-of-little-locks patch, but there's no reason to have it in the mainline, because people wouldn't be able to test whether they were improving performance for the high end anyway.

Scalability is a double edged sword...

Posted Jul 19, 2002 16:07 UTC (Fri) by DeletedUser2649 ((unknown), #2649)

I'm just a dabbler in this area; I've done a little bit of parallel computation, and a decent amount of study of it, both in classes and on my own. You sound experienced.

I wanted to know what you think of McVoy's solution to this problem: minimise the number of spinlocks (perhaps keeping it optimised for a 4-processor SMP, since that's known to work reasonably well), and instead support SMP clusters.

There's a brief article on the bad part of fine-grained locking at http://www.linuxmall.com/news/?1,43. I can't find Larry's article on the solution anymore; it's hard to find, but it's out there.

Opinions? It sounds great to me: seems like the best of all worlds, with more capabilities for everyone.

-Billy

Scalability is a double edged sword...

Posted Jul 19, 2002 17:29 UTC (Fri) by davecb (subscriber, #1574)

As iabervon said, fine-grained locking doesn't mean holding more locks, but rather holding a different lock from what different code might hold[...] it's better for small SMP

Taking small, well-crafted locks in applications reduces locking bottlenecks as well as the likelihood of deadlock, whatever the number of processors, 1 as well as 1000. The argument is weaker in a kernel on a uniprocessor, but can still be made when the locks are being held by the OS as a surrogate for the application.

Of course, you can take this too far, and end up with 5 locks where one really should do, but then you get the deadlock you deserve (;-)).

This means that an effort to get the grain as small as possible, but no smaller, would be appropriate in any of the uniprocessor kernel, the general multiprocessor kernel, or Larry McVoy's cluster-of-small-SMP kernel.

--dave

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds