A niggle about linearity

Posted May 9, 2011 18:24 UTC (Mon) by davecb (subscriber, #1574)
Parent article: Scale Fail (part 1)

You write "Scaling an application is an arithmetic exercise. If one user consumes X amount of CPU time on the web server, how many web servers do you need to support 100,000 simultaneous users?"

And it is arithmetic (linear), until it isn't.

When you hit a saturation point, the response time stops growing linearly with load, and increases insanely. The response time heads skyward like a homesick angel, and even the sysadmin thinks it's hung. If you plot it, you get a curve that looks a lot like a hockey-stick, with a short horizontal blade, a nice gentle bend... and a long straight handle that just keeps going up.

You add linear amounts of resources to get it back under control, of course, so the cure is arithmetic. The response time and the customer response, however, are hyperbolic (;-))
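
One way to see the knee in that curve is a textbook single-server queue. This is a minimal sketch, assuming an M/M/1 model (random arrivals, one server), where mean response time is R = S / (1 - U). The model and the 0.1 s service time are assumptions made for illustration, not measurements from any real system.

```python
def response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: R = S / (1 - U).

    service_time: seconds a request takes on an otherwise idle server.
    utilization:  fraction of time the server is busy, 0 <= U < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.1  # assumed: seconds per request with no queueing

if __name__ == "__main__":
    for u in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
        r = response_time(SERVICE_TIME, u)
        print(f"utilization {u:4.0%}: response time {r:6.2f} s")
```

Under this model, the first half of the utilization range barely moves the response time, while the last few percent multiply it many times over.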


A niggle about linearity

Posted May 12, 2011 0:45 UTC (Thu) by jberkus (subscriber, #55561) [Link]

Yeah, performance problems are generally thresholded. That is, they're linear until they hit a limit, and then things fall apart.

However, you can get an estimate of when you're going to hit those thresholds with some fairly simple arithmetic. It's a shame more people don't try.
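
As a rough sketch of that kind of estimate: take the CPU each user consumes, multiply by the number of users, and divide by what one server can supply while staying below a chosen utilization. The function name, the 70% target, and the example figures are all assumptions for illustration.

```python
import math


def servers_needed(users, cpu_seconds_per_user_per_second,
                   cores_per_server, target_utilization=0.7):
    """Estimate how many servers keep CPU below target_utilization.

    Assumes CPU is the only bottleneck and that load spreads evenly,
    which is exactly the linear assumption discussed in this thread.
    """
    # Total CPU-seconds demanded per wall-clock second, i.e. busy cores.
    demand = users * cpu_seconds_per_user_per_second
    # Cores per server we are willing to keep busy.
    usable = cores_per_server * target_utilization
    return math.ceil(demand / usable)


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 users, 2 ms of CPU per user per second,
    # 8-core servers, and a 70% utilization ceiling for headroom.
    print(servers_needed(100_000, 0.002, 8, 0.7))
```

Keeping the target well below 100% is the arithmetic way of staying clear of the threshold rather than running into it.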

A niggle about linearity

Posted May 13, 2011 2:59 UTC (Fri) by raven667 (subscriber, #5198) [Link]

I agree. I wish more people understood databases and storage IO at least as well as many understand network IO. The recent rediscovery of buffer bloat and latency, for example, doesn't seem to have happened for storage; people talk as if MB/s were the only stat that matters, when it is often the least interesting.

I've struggled with this kind of issue in the past when trying to understand the performance issues and needs of a large in-house system that I supported for many years. I got it wrong many times, and the simple estimations that might have helped only look simple in retrospect. There is a lot of pressure to treat databases and storage as a black box until you ask more from them than they can give.

A niggle about linearity

Posted May 13, 2011 13:13 UTC (Fri) by andrewt (subscriber, #5703) [Link]

Be careful, as CPU utilization does not always correlate with work done. In fact, CPUs with hyper-threading have quite a surprise in store when they cross the 50% utilization point: you might get only 25% more transactions as the CPU goes from 50% to 100%, not the additional 100% one would expect from simple arithmetic. Even without hyper-threading, there are enough other things to bust the whole linearity assumption, like cache warmth.
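
A toy model of that effect, as a sketch only: assume two hardware threads per core, a scheduler that fills one thread per core before using the siblings, and a 25% total gain from the second threads (the figure mentioned above). The function name and the piecewise-linear shape are assumptions, not measured behavior of any particular CPU.

```python
def relative_throughput(reported_utilization, smt_gain=0.25):
    """Throughput relative to the level reached at 50% reported utilization.

    Below 50%, each busy logical CPU is assumed to have a physical core to
    itself, so throughput scales with utilization. Above 50%, the sibling
    threads are assumed to add only smt_gain in total.
    """
    u = reported_utilization
    if not 0 <= u <= 1:
        raise ValueError("utilization must be in [0, 1]")
    if u <= 0.5:
        return u / 0.5
    return 1.0 + smt_gain * (u - 0.5) / 0.5


if __name__ == "__main__":
    for u in (0.25, 0.5, 0.75, 1.0):
        naive = u / 0.5  # what straight-line arithmetic would predict
        print(f"reported {u:4.0%}: naive {naive:.2f}x, "
              f"model {relative_throughput(u):.3f}x")
```

In this model, a box reporting 50% CPU has used most of its capacity already, not half of it.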

A niggle about linearity

Posted May 13, 2011 20:05 UTC (Fri) by dlang (subscriber, #313) [Link]

that only works if you know what those thresholds are.

in many cases they are not where you expect them to be (the hyperthreaded cpu utilisation is one example, locking overhead with multiple processors is another)

frequently there are factors in play that you don't know about, and the result is that until you test it to a particular load, you have no way of knowing if the system will reach that load.

interpolation (guessing how things work between measured points) is fairly reliable

extrapolation (guessing how things will work beyond measured points) is only reliable until some new factor shows up.
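
a minimal sketch of that rule in code: interpolate linearly between load-test measurements, and refuse to answer outside the range that was actually tested. the measurement values and the function name are hypothetical, made up for illustration.

```python
from bisect import bisect_left

# Hypothetical load-test results: (requests per second, mean response time in s).
MEASURED = [(100, 0.11), (200, 0.12), (400, 0.15), (800, 0.30)]


def estimate_response_time(load, measured=MEASURED):
    """Linearly interpolate between measured points; never extrapolate."""
    points = sorted(measured)
    loads = [x for x, _ in points]
    if load < loads[0] or load > loads[-1]:
        raise ValueError(
            f"load {load} is outside the tested range "
            f"{loads[0]}..{loads[-1]}; test it instead of guessing"
        )
    i = bisect_left(loads, load)
    if loads[i] == load:
        return points[i][1]  # exact measurement, no interpolation needed
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (load - x0) / (x1 - x0)


if __name__ == "__main__":
    print(estimate_response_time(300))  # between two measured points
    try:
        estimate_response_time(1000)    # beyond the last measured point
    except ValueError as err:
        print("refused:", err)
```

raising an error for untested loads is a deliberate choice here: it forces another test run rather than hiding a guess behind a number.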

A niggle about linearity

Posted May 22, 2011 22:29 UTC (Sun) by rodgerd (guest, #58896) [Link]

Q: Why is our new version of $APPLICATION using a ton more CPU on the database?

A: Because when I run the new query you put in from a simple script, it uses a whole CPU and takes a second. Your performance limit is $NUMCPU/second, at best.

Q: [goes away to redo query]
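
The ceiling in that answer is a one-line calculation. A minimal sketch, assuming the query is purely CPU-bound and each run occupies one CPU for its whole duration; the function names and the example figures are hypothetical.

```python
import math


def max_queries_per_second(num_cpus, cpu_seconds_per_query):
    """Best-case throughput when every CPU does nothing but this query."""
    return num_cpus / cpu_seconds_per_query


def cpus_needed(target_queries_per_second, cpu_seconds_per_query):
    """CPUs required to sustain a target rate, with no headroom at all."""
    return math.ceil(target_queries_per_second * cpu_seconds_per_query)


if __name__ == "__main__":
    # Hypothetical: a 16-CPU database server and a query costing 1 CPU-second.
    print(max_queries_per_second(16, 1.0))
    # Hypothetical: the application wants 50 such queries per second.
    print(cpus_needed(50, 1.0))
```

Locking, I/O, and other queries only lower the real limit, which is why the answer says "at best".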
