Posted Jul 22, 2004 18:56 UTC (Thu) by avik (subscriber, #704)
Sure. It's a clustered filesystem (a hot topic these days). It's implemented in userspace, so efficient and copyless I/O are essential.
We're using Red Hat's Enterprise kernel, which provides aio poll and blockdev read/write. We hacked aio readv and writev (a must for performance) and copyless aio udp sendmsg. We'll probably do aio tcp sendmsg as well.
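For readers who haven't used vectored I/O, here is a minimal sketch of what readv/writev buy you. It uses Python's synchronous `os.pwritev` and `os.preadv` as a stand-in; the aio variants mentioned above were local kernel hacks and are not shown. The function name `scatter_gather_roundtrip`, the header/payload split and the temporary file are assumptions made up for illustration.

```python
import os
import tempfile


def scatter_gather_roundtrip(header: bytes, payload: bytes):
    """Write two separate buffers with one call, then read them back into
    two separate buffers with one call (synchronous stand-in, not aio)."""
    fd, path = tempfile.mkstemp()  # hypothetical scratch file
    try:
        # Gather write: both buffers go out in a single system call, without
        # the caller first joining them into one contiguous buffer.
        written = os.pwritev(fd, [header, payload], 0)
        if written != len(header) + len(payload):
            raise OSError("short write")

        # Scatter read: one system call fills the destination buffers in order.
        hdr_buf = bytearray(len(header))
        body_buf = bytearray(len(payload))
        nread = os.preadv(fd, [hdr_buf, body_buf], 0)
        return nread, bytes(hdr_buf), bytes(body_buf)
    finally:
        os.close(fd)
        os.unlink(path)
```

Without the vectored calls, the caller would either issue one system call per buffer or copy everything into a single buffer first, which is presumably why they matter for performance here.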
We would dearly love the receive counterparts as well, especially if they are copyless, though more sophisticated networking hardware is probably needed.
I'd like to emphasize that aio poll is absolutely crucial for mixing networking and blockdev I/O, at least until full aio networking support is available.
Avi
One happy user
Posted Jul 23, 2004 23:25 UTC (Fri) by giraffedata (subscriber, #1954)
What I'd like to hear is how it's better than buffered I/O and/or multiple threads doing synchronous I/O. That's what it's competing with.
One happy user
Posted Jul 24, 2004 7:10 UTC (Sat) by avik (subscriber, #704)
Buffered I/O is out of the question. It involves copying, which reduces throughput. Synchronous reads, buffered or not, will block, throwing parallelism out the window. We do our own caching (distributed filesystem), so the kernel cache is very small and ineffective; it only adds overhead.
Our previous solution was nonblocking network I/O, plus slave threads doing synchronous direct disk I/O. This had two problems:
- less efficient due to context switches and scheduling latencies
- no way to do copyless network I/O.
The last point is only partially addressed by aio, but it's better than nothing.
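To make the shape of that earlier design concrete, here is a rough sketch under stated assumptions: an event loop doing nonblocking socket I/O, plus a worker thread doing a synchronous disk read and waking the loop when it finishes. It is not the actual code. The name `serve_one_read`, the socketpairs standing in for a network peer, and the plain buffered `os.pread` (rather than direct I/O, which needs aligned buffers) are all illustrative choices.

```python
import os
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def serve_one_read(path: str, offset: int, length: int) -> bytes:
    """Handle one request: the event loop owns the sockets, a worker thread
    does the blocking disk read and signals completion via a socketpair."""
    client, server = socket.socketpair()  # stands in for a network peer
    wake_r, wake_w = socket.socketpair()  # worker -> event loop wakeup
    server.setblocking(False)
    wake_r.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(server, selectors.EVENT_READ, "net")
    sel.register(wake_r, selectors.EVENT_READ, "disk")

    def blocking_read() -> bytes:
        # Synchronous read; only this worker thread blocks on the disk.
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.pread(fd, length, offset)
        finally:
            os.close(fd)
            wake_w.send(b"x")  # waking the loop costs a context switch

    future = None
    with ThreadPoolExecutor(max_workers=1) as pool:
        client.sendall(b"READ")  # the peer's request
        done = False
        while not done:
            for key, _ in sel.select():
                if key.data == "net":
                    server.recv(16)  # request arrived, hand off to the worker
                    future = pool.submit(blocking_read)
                else:
                    wake_r.recv(1)  # worker reports completion
                    # Small reply assumed; this send copies the data.
                    server.sendall(future.result())
                    done = True
    reply = client.recv(length)
    sel.close()
    for s in (client, server, wake_r, wake_w):
        s.close()
    return reply
```

Every disk request in this arrangement takes two thread handoffs (loop to worker, worker back to loop), and the reply is copied on its way into the socket, which corresponds to the two problems listed above.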