Date: Tue, 30 Jun 1998 10:27:51 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
To: "Jeffrey B. Siegal" <jbs@quiotix.com>
Subject: Re: Thread implementations...

On Tue, 30 Jun 1998, Jeffrey B. Siegal wrote:
>
> One thing that bothers me about this is that it sending out full-sized packets
> seems like it would involve copying the data in order to build these full-sized
> packets. If memory is slow and the network card is smart and efficient, it might
> be cheaper to just send smaller packets. Thoughts?

Note that regardless of how fast and smart the network card is, if your
actual _network_ is faster than your memory copy speed it is time to
either throw away your computer and start over, or just admit to yourself
that whatever you do your computer is never going to be very good at
doing web-serving.

What I'm trying to say is that "cheaper" is not immediately obvious. Yes,
you may spend less CPU cycles on it. But if the network performance
suffers, then you just lost something, and the "cheaper in CPU" approach
actually became "more expensive in real life".

I personally don't actually think that web-serving should ever be
CPU-bound when it comes to the actual networking part. CGI, yes. But if
your web-server is so CPU-bound by just trying to keep up with the
network and disk that one system call per transfer makes a difference,
then something is seriously wrong with your setup.

Note: this is not denouncing using scatter-gather on the network card
etc, and trying to use less copies. sendfile() is actually the much nicer
interface for that, simply because suddenly all the problems that delayed
writes with zero-copy had with the UNIX semantics of "write()" no longer
exist - because sendfile() doesn't have to have the unix semantics of
writes.
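
To make the "one system call per transfer" point concrete, here is a
minimal sketch of handing a whole file to the kernel. It assumes the
Linux sendfile(2) interface as it exists today (out_fd, in_fd, offset
pointer, count), which was settled after this mail was written; the
helper name send_whole_file() is hypothetical and only for illustration.
The caller never supplies a user buffer, so there is nothing for user
code to scribble on while the data is in flight.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send all of in_fd (a regular file) to out_fd
 * (typically a connected socket) with Linux sendfile().
 * Returns the number of bytes sent, or -1 on error with errno set.
 * Assumes out_fd is blocking; a non-blocking caller would have to
 * handle EAGAIN itself.
 */
ssize_t send_whole_file(int out_fd, int in_fd)
{
    struct stat st;
    if (fstat(in_fd, &st) < 0)
        return -1;

    off_t offset = 0;   /* sendfile() advances this for us */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, just retry */
            return -1;
        }
        if (n == 0)
            break;          /* file shrank underneath us; stop here */
    }
    return (ssize_t)offset;
}
```

In the common case the loop body runs once. Passing an explicit offset
means the file position of in_fd is left alone, so the same descriptor
could be reused for other transfers.
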

[ For those of you who haven't been in on that particular discussion: in
  many cases you'd like to just give the network card a series of
  physical addresses, and tell it "take these, send them out as a TCP
  packet, and tell me when you're done". And then go on to serving the
  next packet, knowing that the network card will do the actual work in
  the background.

  This is really hard to do with "write()", because if you return from
  the write() system call before the network card has finished everything
  (which is what you'd like to do in order to overlap calculations with
  communication) then you suddenly have lots of problems with coherency
  and making sure the user doesn't modify the buffer until everything has
  been sent from the old buffer. That's what the UNIX semantics require
  for "write()".

  However, for "sendfile()" it just makes perfect sense to allow this.
  The "sendfile()" thing in effect asks the system to send out the file -
  it doesn't ask the system to send out some specific buffer that people
  can scribble on at will. Suddenly the OS has much more control, and at
  the same time we have fewer rules too (we might for example say that
  "hey, if somebody is in the middle of modifying this file, then who
  knows whether we'll send out old or new data?" - we don't have to keep
  the thing coherent to the same degree we have to keep a user buffer
  coherent).

  As such, it suddenly becomes much easier to do clever tricks like
  background sending and letting a network card really shine. I'm not
  claiming it is easy, but I'm claiming it is easiER. ]

I understand that people are nervous about adding new system calls, and
especially something that is most well-known in the NT community. But
we've shamelessly stolen from others - clone() was very much influenced
by plan-9, as was the /proc filesystem. Let's not be picky about where
the stolen ideas come from..

		Linus