The April, 2007 Netcraft Web Server Survey

Posted Apr 3, 2007 8:28 UTC (Tue) by jmtapio (subscriber, #23124)
Parent article: The April, 2007 Netcraft Web Server Survey

It is no wonder that lighttpd has been gaining so much in popularity. Apache does seem to use unbelievable amounts of memory in some configurations nowadays. For example, one setup that I am involved with:

$ ps aux | head -1; ps aux | grep apache2 | head -1
USER   PID %CPU %MEM    VSZ   RSS TTY STAT START TIME COMMAND
root 11964  0.0  3.5 154740 72856 ?   Ss   06:25 0:05 /usr/sbin/apache2

And that is just the controlling process. The actual request-serving processes in this setup can use even more resident memory. And there are no memory caches or anything like that in use, just a typical setup with CGI scripts and PHP.
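To put the ps figures above into more familiar units, here is a minimal Python sketch that parses the quoted row. It assumes the usual Linux procps behaviour, where `ps aux` reports VSZ and RSS in KiB; the helper names `parse_ps_line` and `kib_to_mib` are made up for illustration.

```python
def parse_ps_line(line):
    """Split one `ps aux` row into named fields (COMMAND may contain spaces)."""
    fields = line.split(None, 10)
    return {
        "user": fields[0],
        "pid": int(fields[1]),
        "vsz_kib": int(fields[4]),  # virtual size, assumed to be in KiB
        "rss_kib": int(fields[5]),  # resident set size, assumed to be in KiB
        "command": fields[10],
    }


def kib_to_mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024.0


# The row quoted in the comment above.
row = parse_ps_line(
    "root 11964  0.0  3.5 154740 72856 ?   Ss   06:25 0:05 /usr/sbin/apache2"
)
print("VSZ %.1f MiB, RSS %.1f MiB" % (kib_to_mib(row["vsz_kib"]),
                                      kib_to_mib(row["rss_kib"])))
```

Under that assumption the controlling process shows roughly 151 MiB of virtual size and 71 MiB resident.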

I can understand why Zope and JBoss and the like consume so much memory, but Apache I cannot understand. If the entire server used that much memory, it would certainly be okay, but these figures are per-process memory consumption with the prefork model.

Anyway, it would seem that the best model for web servers is to use the httpd as a kind of web router that serves only the simple requests in-process and communicates with external processes for every complicated request. Although Apache is very modular and I like the flexibility and configuration, it still seems very bloated for acting as a "web router". That may just be the reason why people are looking for alternatives.

Competition is good and I hope lighttpd will gain more popularity.


The April, 2007 Netcraft Web Server Survey

Posted Apr 3, 2007 12:27 UTC (Tue) by eklitzke (subscriber, #36426)

I probably don't need to point out that accounting for memory usage is a tricky business, and that the numbers given by ps and other programs aren't useful without more information, but here goes anyway.

Right now my home server is using 137 MB of RAM (as reported by the -/+ buffers/cache line of free). ps aux would have me believe that mysqld is using 127 MB of RAM, and that each Apache process (and there are about a dozen) is using about 24 MB of RAM (these are the VSZ columns). Obviously a lot of memory is being overreported. If I look at the numbers with ps -lyef, I see that the SZ column for mysqld is about 31 MB, and 6 MB for each Apache process (4 for the master), which seems more in line with what I would expect the numbers to be.

I'm not really sure how the numbers for MySQL got so high, but in the case of Apache the explanation is fairly simple. In the preforking model the main process starts up and then forks a bunch of times to allow its children to handle new requests. On a server that isn't too loaded, most of the forks will end up sharing nearly all of their memory. If I go off of the numbers in the SZ column, this is basically confirmed -- each child only consumes about 6 MB of new memory, so altogether Apache is being fairly reasonable.
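One way to see how much of a forked child's memory is shared is to compare Rss with Pss (proportional set size, where each shared page is divided among the processes mapping it). The sketch below is a rough illustration, not something from the thread. It assumes a Linux kernel whose /proc/<pid>/smaps exposes a Pss field, and the SAMPLE text and its numbers are invented.

```python
def sum_smaps_fields(smaps_text, fields=("Rss", "Pss")):
    """Sum the selected per-mapping fields (in kB) from smaps-style text."""
    totals = {name: 0 for name in fields}
    for line in smaps_text.splitlines():
        key, _, rest = line.partition(":")
        if key in totals:
            totals[key] += int(rest.split()[0])
    return totals


def read_smaps(pid):
    """Read /proc/<pid>/smaps (Linux only; needs permission for that pid)."""
    with open("/proc/%d/smaps" % pid) as handle:
        return handle.read()


# Hypothetical excerpt: a text mapping shared by ten children, plus a
# private heap. The numbers are made up for illustration.
SAMPLE = """\
00400000-0048a000 r-xp 00000000 fd:00 123 /usr/sbin/apache2
Rss:                 400 kB
Pss:                  40 kB
7f0000000000-7f0000200000 rw-p 00000000 00:00 0 [heap]
Rss:                2048 kB
Pss:                2048 kB
"""

totals = sum_smaps_fields(SAMPLE)
print(totals)
```

On a live system, `sum_smaps_fields(read_smaps(pid))` would give the same two totals for a real process. Summing Pss across all children avoids counting shared pages once per process, which summing Rss does.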

With respect to communicating with external processes for complicated requests, this is of course very slow. The primary reason that you use Apache is precisely that it doesn't need to do this. I think that you will find that mod_{php,perl,python} are much faster than any sort of CGI alternative, and of course Apache's model enables it to scale with concurrency very, very well. If you don't have a lot of traffic and can deal with a slightly slower HTTP server, you'll definitely find that lighttpd and the alternatives are a lot more friendly on memory, but this of course comes with a definite cost in terms of overall speed and the ability to scale with concurrency.

Cost in speed?

Posted Apr 3, 2007 13:03 UTC (Tue) by nigelm (subscriber, #622)

> this of course comes with a definite cost in terms of overall speed and the ability to scale with concurrency

This is making the assumption that the Apache-encouraged method of pushing everything into one process (as in mod_p*) makes things faster and more scalable. With lighttpd you tend to use FastCGI for persistent processes instead, which makes it easier to scale the number of processes used for dynamic content appropriately (they can also be on different machines if you wish), whilst still allowing static content to be fired out by the main server, so avoiding the Apache tendency to split out static and dynamic servers.
Additionally, since the dynamic stuff is running in a different process, you can make it run as a different user, which gives better security separation. All at a speed which is at least as good as Apache.

BTW there are some additional tweaks in lighttpd 1.5.x which promise to make external processes even faster.

The April, 2007 Netcraft Web Server Survey

Posted Apr 4, 2007 2:09 UTC (Wed) by njs (subscriber, #40338)

>With respect to communicating with external processes for complicated requests, this is of course very slow.

Only if you do it wrong -- you seem to be assuming CGI. The bottleneck in CGI isn't the IPC, and it isn't even the fork(), it's the exec() -- mapping in the executable, dynamic linking, maybe byte-compiling, getting through main() to your actual code, etc., on every request.

Any setup that has apache talking to a persistent process, rather than loading a whole new one on every request (FCGI/SCGI/mod_proxy/...), will scale basically as well as mod_p*. The IPC overhead is basically a bounce through a memory buffer; it's not even radically different from the sort of work that might happen just moving your data out of mod_p* into apache's C layer.
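The difference between the two styles can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the "handler" is a made-up stand-in that upper-cases each request line, a plain pipe stands in for the FCGI/SCGI protocol, and `cgi_style` and `persistent_style` are hypothetical names.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for "your actual code": upper-case each request line.
HANDLER = """
import sys
while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write(line.upper())
    sys.stdout.flush()
"""


def cgi_style(requests):
    """Start a fresh interpreter for every request, as plain CGI would."""
    replies = []
    for req in requests:
        done = subprocess.run([sys.executable, "-c", HANDLER],
                              input=req + "\n", capture_output=True,
                              text=True, check=True)
        replies.append(done.stdout.strip())
    return replies


def persistent_style(requests):
    """Start one worker and feed it every request over a pipe."""
    worker = subprocess.Popen([sys.executable, "-c", HANDLER],
                              stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                              text=True)
    replies = []
    try:
        for req in requests:
            worker.stdin.write(req + "\n")
            worker.stdin.flush()
            replies.append(worker.stdout.readline().strip())
    finally:
        worker.stdin.close()
        worker.wait()
    return replies


if __name__ == "__main__":
    sample = ["get /page%d" % i for i in range(5)]
    for style in (cgi_style, persistent_style):
        start = time.perf_counter()
        style(sample)
        print(style.__name__, "%.3f s" % (time.perf_counter() - start))
```

Both functions return the same replies; the timing printed at the end gives a rough feel for how much of the per-request cost is process startup rather than the pipe round trip.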

You shouldn't see any cost to concurrency either, just the opposite... for instance, if you have multiple areas of your website (or virtual domains, or whatever) that are run under different systems (say one part using a particular CMS, another a different blogging system, whatever), and each gets about the same number of hits, then you can have 10 CMS processes and 10 blog processes to handle both, instead of 20 apache+cms+blog processes, which take a total of 2x the memory. And it's worse if they aren't getting the same number of hits, because you still have to pay (total number of hits)*(total number of systems) in memory, while in the more decoupled design you can have, say, 15 CMS processes and 5 blog processes...
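The arithmetic in that example can be written out as a small Python sketch. The 30 MB per-system figure is a made-up assumption, as are the function names, and the model ignores the front-end server's own footprint and any pages shared between processes.

```python
def coupled_memory(total_procs, system_sizes_mb):
    """Every server process embeds every system (the mod_p* style)."""
    return total_procs * sum(system_sizes_mb.values())


def decoupled_memory(procs_per_system, system_sizes_mb):
    """Each system runs in its own pool of persistent processes."""
    return sum(count * system_sizes_mb[name]
               for name, count in procs_per_system.items())


# Hypothetical per-process footprint of each system, in MB.
sizes = {"cms": 30, "blog": 30}

coupled = coupled_memory(20, sizes)                          # 20 * (30 + 30)
even = decoupled_memory({"cms": 10, "blog": 10}, sizes)      # 10*30 + 10*30
skewed = decoupled_memory({"cms": 15, "blog": 5}, sizes)     # 15*30 + 5*30

print("coupled: %d MB, decoupled (10/10): %d MB, decoupled (15/5): %d MB"
      % (coupled, even, skewed))
print("ratio: %.1fx" % (coupled / even))
```

With two equally sized systems the coupled layout costs twice as much, matching the 2x figure above, and the decoupled layout can shift processes toward the busier system without growing the total.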

The April, 2007 Netcraft Web Server Survey

Posted Apr 4, 2007 11:09 UTC (Wed) by drag (subscriber, #31333)

In any benchmarks I've seen, even with FastCGI and SCGI, Apache has always been able to scale quite a bit higher with its *mods than lighttpd with SCGI or whatever.

But usually it's artificial benchmarks designed to stress the system as hard as possible with a script generating 'hello world' or something like that.

So it's pointless for real life, where you'd have your database or bandwidth limiting you.

I think the way it seems right now is that if you want something small, where memory usage is important (say with a virtual private server), then lighttpd + fastcgi/scgi is what you'd want to use.

If you want to have something that is portable and fast with minimal dependencies and easier configuration, then lighttpd is good also.

But if you want something to scale to the upper end of things then Apache with *mods is generally the way to go. Not always, of course. Just generally.

There are a lot of projects that make websites for home users or specific business stuff, like configuration interfaces for firewalls, print/file servers, or stuff like bittorrent clients or corporate intranet forums. For those, I think lighttpd is a boon. Apache is just too huge a hammer for such small nails.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds