
Entering the mosh pit

Posted May 17, 2017 12:41 UTC (Wed) by epa (subscriber, #39769)
In reply to: Entering the mosh pit by paulj
Parent article: Entering the mosh pit

I was thinking you could generate a megabyte of random data, send it across the ssh link, and use it as a one-time pad. That would cut latency further since you just need to XOR the incoming bytes rather than run AES decryption. As long as no packets get lost, I suppose. When the one-time pad is about to run out you can make a new one and again send it via the open ssh connection.
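A minimal sketch of the idea, in Python, assuming both ends already hold the same pad (here it is simply shared in memory rather than sent over ssh). `PadStream` and `xor_bytes` are hypothetical names for illustration, not part of mosh or OpenSSH. Each side keeps a cursor into the pad, which is why a lost packet would leave the two ends out of step.

```python
import secrets

PAD_SIZE = 1024 * 1024  # one megabyte, as suggested in the comment


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR two byte strings; the result is as long as the shorter one."""
    return bytes(d ^ p for d, p in zip(data, pad))


class PadStream:
    """Consumes a shared pad front to back; each pad byte is used once."""

    def __init__(self, pad: bytes) -> None:
        self.pad = pad
        self.pos = 0  # cursor: next unused pad byte

    def apply(self, data: bytes) -> bytes:
        """Encrypt or decrypt (XOR is its own inverse) and advance the cursor."""
        end = self.pos + len(data)
        if end > len(self.pad):
            raise ValueError("pad exhausted; a fresh pad is needed")
        chunk = self.pad[self.pos:end]
        self.pos = end
        return xor_bytes(data, chunk)


# Both ends start from the same pad and the same cursor position.
pad = secrets.token_bytes(PAD_SIZE)
sender = PadStream(pad)
receiver = PadStream(pad)

ciphertext = sender.apply(b"ls -l\n")
plaintext = receiver.apply(ciphertext)
```

If a packet goes missing, the sender's cursor moves on while the receiver's does not, so everything after that point decrypts with the wrong pad bytes. Note also that plain XOR gives confidentiality only; nothing here detects tampering.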

Or am I worrying too much about the latency imposed by AES?



Entering the mosh pit

Posted May 17, 2017 12:53 UTC (Wed) by farnz (subscriber, #17727)

Bear in mind that modern CPUs have dedicated AES assist instructions - this set of slides suggests that you end up using around 2 clock cycles per byte with an optimal implementation, and a latency on the order of 10 clock cycles from first byte in to first byte out. Given the clock speeds on modern CPUs, I suspect that you're going to hit network limits before AES is too slow.
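As a rough worked example of those figures in Python: the 2 cycles per byte and 10 cycle startup come from the comment above, while the 3 GHz clock, the 100-byte packet, and the 1 Gbit/s link are assumptions picked purely for illustration.

```python
CLOCK_HZ = 3.0e9            # assumption: a 3 GHz core
CYCLES_PER_BYTE = 2.0       # figure quoted in the comment
STARTUP_CYCLES = 10.0       # first byte in to first byte out, per the comment
LINK_BITS_PER_SEC = 1.0e9   # assumption: a 1 Gbit/s network link


def aes_time_seconds(n_bytes: int) -> float:
    """Estimated time to process n_bytes with hardware-assisted AES."""
    return (STARTUP_CYCLES + CYCLES_PER_BYTE * n_bytes) / CLOCK_HZ


def aes_throughput_bytes_per_sec() -> float:
    """Sustained single-core AES throughput implied by the cycles-per-byte figure."""
    return CLOCK_HZ / CYCLES_PER_BYTE


packet_time = aes_time_seconds(100)             # a small, keystroke-sized packet
aes_rate = aes_throughput_bytes_per_sec()       # bytes per second
link_rate = LINK_BITS_PER_SEC / 8               # bytes per second

print(f"100-byte packet: {packet_time * 1e9:.0f} ns")
print(f"AES: {aes_rate / 1e6:.0f} MB/s, link: {link_rate / 1e6:.0f} MB/s")
```

Under these assumptions a small packet costs on the order of 70 ns of AES work, and one core can process data about twelve times faster than the assumed link can deliver it.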

Entering the mosh pit

Posted May 17, 2017 15:59 UTC (Wed) by epa (subscriber, #39769)

Thanks. I am still living in the past, when I had to patch the openssh source code to add a 'no encryption' mode to get decent latency on my collection of ancient i386 machines...

Ten clock cycles (or even ten thousand) is plenty fast enough to display a keystroke instantly.

Entering the mosh pit

Posted May 17, 2017 16:07 UTC (Wed) by farnz (subscriber, #17727)

Yeah - AES acceleration instructions mean that your latency is dominated by network latency, not compute latency, by a significant margin - a realistic implementation of AES decrypt using the acceleration instructions is going to add under a microsecond of latency, which is less than the minimum Ethernet latency you're going to see on a real 10G network.

Entering the mosh pit

Posted May 18, 2017 7:08 UTC (Thu) by luto (guest, #39314)

Even without AES acceleration, you're not going to notice the latency -- AES is quite fast even in software, and mosh is meant to be used by humans.

Entering the mosh pit

Posted May 18, 2017 15:30 UTC (Thu) by flussence (guest, #85566)

ChaCha20 is a bit faster than AES (4-15cpb), which is one of the reasons it's the default in OpenSSH now.

(Not fast enough to make me give up using HPN-SSH, mind you...)

Entering the mosh pit

Posted May 17, 2017 15:25 UTC (Wed) by itvirta (guest, #49997)

The possibility of packets getting lost is quite severe when the connection is done over UDP, _and_ when one of the useful use-cases is that of roaming or sleeping laptop clients. Besides, what would you do when your megabyte runs out? As for the latency of AES, that's what your SSH connections are encrypted with, too.

Entering the mosh pit

Posted May 17, 2017 15:55 UTC (Wed) by epa (subscriber, #39769)

Yes, the ssh connection is encrypted and has high latency. The one megabyte block of random data can be sent as a one-off on connection, and topped up in the 'background' when the terminal session is idle for a second or two. It doesn't matter that there is high latency for sending that data since it isn't needed immediately. This scheme would be inappropriate for a connection with lots of data going over it all the time, but it could work pretty well if what you are sending is tiny updates (individual keystrokes, or at most a screenful of text), which come at fairly infrequent intervals (human typing is slow), but they need to be processed with as little latency as possible when they do happen.

I suppose that if the UDP packets have a sequence number and a fixed length, you can still use the one-time pad to encrypt them (if a packet is lost, then that bit of the one-time pad is wasted too).
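That could look something like the following Python sketch, where the sequence number picks the pad offset so that a lost packet only wastes its own slice. The 64-byte `PACKET_LEN`, the 8-byte sequence header, and the names `seal` and `open_packet` are all assumptions made for illustration; none of this is how mosh actually frames its packets.

```python
import secrets

PACKET_LEN = 64   # hypothetical fixed payload size in bytes
SEQ_BYTES = 8     # hypothetical size of the cleartext sequence number header


def pad_slice(pad: bytes, seq: int) -> bytes:
    """Return the pad bytes reserved for packet number seq."""
    start = seq * PACKET_LEN
    end = start + PACKET_LEN
    if end > len(pad):
        raise ValueError("pad exhausted; a fresh pad is needed")
    return pad[start:end]


def seal(pad: bytes, seq: int, payload: bytes) -> bytes:
    """Build a packet: cleartext sequence number plus XOR-encrypted payload."""
    if len(payload) > PACKET_LEN:
        raise ValueError("payload does not fit in one packet")
    # Zero-pad to the fixed length (a simplification: trailing NULs are lost).
    padded = payload.ljust(PACKET_LEN, b"\x00")
    body = bytes(a ^ b for a, b in zip(padded, pad_slice(pad, seq)))
    return seq.to_bytes(SEQ_BYTES, "big") + body


def open_packet(pad: bytes, packet: bytes) -> tuple[int, bytes]:
    """Recover (sequence number, payload) from a packet built by seal()."""
    seq = int.from_bytes(packet[:SEQ_BYTES], "big")
    body = packet[SEQ_BYTES:]
    plain = bytes(a ^ b for a, b in zip(body, pad_slice(pad, seq)))
    return seq, plain.rstrip(b"\x00")


pad = secrets.token_bytes(1024 * 1024)
packets = [seal(pad, seq, key) for seq, key in enumerate([b"l", b"s", b"\n"])]

# Pretend packet 1 was dropped by the network: packet 2 still decrypts.
seq, payload = open_packet(pad, packets[2])
```

The sender must never reuse a sequence number, since that would reuse pad bytes, and as before XOR alone provides no integrity check on the packet.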

Others have pointed out how on modern CPUs AES is fast, so it may be a non-issue. (Although I would point out there is a difference between the average speed for decrypting a large block of data, and the speed if you are doing a single byte at a time. I doubt that decrypting just one byte on its own can be done in two clock cycles. But even if it takes a hundred thousand cycles that's still fast enough, on a modern CPU, to display the keystroke 'instantly' to a human user.)

Entering the mosh pit

Posted May 17, 2017 20:07 UTC (Wed) by cgull (guest, #115681)

You're worrying too much. Mosh's bandwidth needs are quite small, and even on a system without AES hardware, the crypto computation cost is pretty much in the noise compared to the virtual terminal emulation and state keeping. It's been a while since I benchmarked this, but I recall crypto being at most 2% of CPU usage in a profile.

