I don't think "short connection = low latency, long connection = high throughput" is a good idea.
Due to the 3-way handshake, TCP connections which only transfer a small amount of data have to pay a heavy latency penalty before sending any data at all. It seems pretty silly to ask applications that want low latency to spawn a blizzard of tiny TCP connections, all of which will have to do the 3-way handshake before sending even a single byte. Also, spawning a blizzard of connections tends to short-circuit even the limited amount of fairness that you currently get from TCP.
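To make the ordering concrete: connect() doesn't return until the handshake completes, so the application literally cannot hand the kernel its first payload byte until a full round trip has gone by. A minimal localhost sketch (the loopback RTT is negligible, but on a real WAN t_handshake is one full RTT before anything useful happens):

```python
import socket
import threading
import time

def echo_server(srv):
    conn, _ = srv.accept()
    conn.sendall(conn.recv(1024))  # echo one request back
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

t0 = time.monotonic()
c = socket.create_connection(srv.getsockname())  # blocks for the 3-way handshake
t_handshake = time.monotonic() - t0

c.sendall(b"x")          # only now can the first data byte go out
reply = c.recv(1024)
t_total = time.monotonic() - t0
c.close()
```

For a transfer this small, the handshake round trip dominates, which is exactly why spawning many tiny connections is such a bad deal for latency.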
This problem is one of the reasons why Google designed SPDY. The SPDY web page explains that it was designed "to minimize latency" by "allow[ing] many concurrent HTTP requests to run across a single TCP session."
Routers could do deep packet inspection and try to put packets into a class of service that way. This is a dirty hack, on par with flash drives scanning the disk for the FAT header. Still, we've been stuck with even dirtier hacks in the past, so who knows.
I still feel like the right solution is for the application to set a flag in the header somewhere. The application is the one that knows. To take your example, ssh knows whether its input is coming from a tty (interactive) or from a file that's been catted to it (non-interactive). And scp should probably always be non-interactive. You can't deduce this kind of information at a lower layer, because only the application knows.
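For what it's worth, a per-packet flag of this sort already exists in the IPv4 TOS byte (nowadays reinterpreted as DSCP), and the application can set it with setsockopt(). A rough sketch of the ssh/scp split, assuming Linux (the 0x10/0x08 values are the historical IPTOS_LOWDELAY/IPTOS_THROUGHPUT constants; whether routers along the path honor them is a separate question):

```python
import socket
import sys

# Historical IPv4 TOS values; modern networks read this byte as DSCP/ECN.
IPTOS_LOWDELAY = 0x10
IPTOS_THROUGHPUT = 0x08

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Interactive input (a tty) wants low delay; piped/bulk input wants throughput.
tos = IPTOS_LOWDELAY if sys.stdin.isatty() else IPTOS_THROUGHPUT
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
```

As far as I know, OpenSSH actually does something like this (see its IPQoS option), marking interactive sessions differently from bulk transfers like scp.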
I guess there is this thing in TCP called "urgent data" (aka OOB data), but it seems to be kind of a vermiform appendix of the TCP standard. Nobody has ever been able to explain to me what an application might usefully do with it...
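For completeness, here's roughly what the API looks like: a localhost sketch assuming Linux/BSD socket semantics (SO_OOBINLINE off), where select() reports pending urgent data as an "exceptional condition". The classic use case was telnet pushing an interrupt past buffered data, and note that only a single byte of "urgency" survives:

```python
import select
import socket
import threading

results = {}

def server(srv):
    conn, _ = srv.accept()
    # select() flags urgent data as an exceptional condition on the socket.
    _, _, exc = select.select([], [], [conn], 5)
    if exc:
        results["urgent"] = conn.recv(1, socket.MSG_OOB)  # pull the OOB byte
    results["normal"] = conn.recv(1024)                   # ordinary in-band data
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
t = threading.Thread(target=server, args=(srv,))
t.start()

c = socket.create_connection(srv.getsockname())
c.sendall(b"abc")
c.send(b"!", socket.MSG_OOB)  # sets the URG flag on the outgoing segment
c.close()
t.join()
```

The urgent byte arrives out of band (the normal stream still reads "abc"), which is about all the mechanism gives you; it's easy to see why almost nothing uses it.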