Why couldn't there be a negotiation that describes the optimal pixel format so that GUI programs could render their text optimally?
Also, text doesn't change all the time, so on average it probably takes far less bandwidth than video: you only have to transmit the pixels once, and then the user spends a lot of time reading. If you know you're dealing with text (say, because the user told you so), you could disable lossy compression schemes too.
Favorite rant of mine:
Not that I expect the introduction of Wayland to result in good text rendering on Linux. Layering text onto the window image has always been treated as a gamma=1.0 alpha-blending problem on Linux, and the end result is awful color fringing and varying weight on diagonals. These problems are not going to go away until someone finally designs it right from day one.
All I can hope for is that, by complaining about this, eventually someone will wake up and design a text layering pipeline that can do gamma-corrected alpha blending. Until that day, our fonts will continue to suck. There has been some hope recently with sRGB surface support specced in OpenGL, so I can only hope and beg that this flag will get turned on whenever a bitmap representing text is about to get combined with the underlying graphics.
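To make the difference concrete, here is a rough sketch of what I mean by gamma-corrected alpha blending, assuming the framebuffer holds sRGB-encoded values in the 0..1 range and the glyph bitmap supplies a coverage value per pixel. The function names are made up for illustration and do not come from any real toolkit; the transfer functions are the standard sRGB ones.

```python
def srgb_to_linear(c: float) -> float:
    """Decode an sRGB-encoded channel value (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Encode a linear-light channel value (0..1) back to sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1 / 2.4) - 0.055


def blend_naive(fg: float, bg: float, coverage: float) -> float:
    """The gamma=1.0 approach: mix the encoded values directly."""
    return fg * coverage + bg * (1.0 - coverage)


def blend_gamma_corrected(fg: float, bg: float, coverage: float) -> float:
    """Decode to linear light, mix there, then re-encode to sRGB."""
    mixed = (srgb_to_linear(fg) * coverage
             + srgb_to_linear(bg) * (1.0 - coverage))
    return linear_to_srgb(mixed)
```

For black text (0.0) on a white background (1.0) at 50% coverage, the naive blend stores 0.5, while the linear-light blend stores roughly 0.735. The naive edge pixels come out darker than they should, which is one way partially covered pixels end up with the wrong weight.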