Local network latency is very low, much lower than hard disk latency. Throughput is about 100 MB/s, on par with a fast hard disk. The only reason a network filesystem is slower is that there's a slow local fs at the other end.
The fundamental issue is that programs use system calls to communicate with the outside world, and most of those system calls deal (sometimes indirectly) with files. Every one of those calls already lands in the kernel, so for a network filesystem client, going through the kernel, then out to userspace and back again is just a stupid way of doing something relatively simple.
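To make the system call point concrete, here is a minimal Python sketch. The function name `roundtrip` and the file name `syscall_demo.bin` are made up for illustration. It uses the `os` module's thin wrappers around open/write/read/close, and nothing in it knows or cares whether the directory sits on a local disk or on a network mount. Each call enters the kernel, and the kernel decides which filesystem handles it.

```python
import os
import tempfile


def roundtrip(directory: str, payload: bytes) -> bytes:
    """Write payload to a file in `directory` and read it back.

    Uses os.open/os.write/os.read/os.close, which map closely onto the
    underlying system calls. The code is identical for local and network
    filesystems; only the kernel knows the difference.
    """
    path = os.path.join(directory, "syscall_demo.bin")  # hypothetical name

    # open(2) + write(2) + close(2)
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        # A small payload is assumed, so a single write is expected to suffice.
        os.write(fd, payload)
    finally:
        os.close(fd)

    # open(2) + read(2) + close(2)
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(payload))
    finally:
        os.close(fd)

    os.unlink(path)  # unlink(2): clean up the demo file
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(roundtrip(d, b"hello"))
```

Point `directory` at a network mount instead of a temp directory and the program does not change at all. With an in-kernel client the request is handled right where the system call arrives. With a userspace client the kernel has to hand it out to another process and wait for the answer to come back.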
To sum up, network filesystem clients are in-kernel for the same reasons that normal filesystems are in-kernel. For network fs servers it's a slightly different trade-off.