LPC: Life after X

Posted Nov 8, 2010 2:31 UTC (Mon) by drag (guest, #31333)
In reply to: LPC: Life after X by dlang
Parent article: LPC: Life after X

> As a result, even for local graphics, the main cpu should be describing what it wants to be displayed and sending that description to the video card rather than just sending the image to be displayed.

This, on a side note, is also why, even though benchmarks of graphical toolkits often show the software rendering of X ending up faster than the hardware-accelerated version, it's still advantageous to use GPU rendering IF you can do the majority of the rendering on the GPU. Certain toolkit microbenchmarks will often show that CPU rendering is faster at some things than GPU rendering. The GPU just sucks at certain things, but ideally you want to use the GPU 100% of the time to avoid multiple trips over the PCI Express bus. Each time you have to send texture data across the bus you're just burning hundreds of thousands of GPU and CPU cycles waiting for data to be pushed over.

You can imagine the huge penalty you pay if you do, say, text rendering in software but do the rest on the GPU. Even if the GPU rendering were a dozen times slower, in the real world the GPU would still win.
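To put rough shape on that argument, here is a toy cost model. It is only a sketch: the stage costs and the bus transfer cost are made-up, unitless numbers chosen for illustration, not measurements of any real hardware, and `frame_cost` is a hypothetical helper. It charges one transfer each time consecutive rendering stages run on different devices.

```python
# Toy cost model. Every number below is an illustrative assumption,
# not a measurement of any real CPU, GPU, or bus.

def frame_cost(stages, transfer_cost):
    """Sum per-stage compute cost, plus one bus transfer each time
    consecutive stages run on different devices."""
    total = 0.0
    previous_device = None
    for device, compute_cost in stages:
        if previous_device is not None and device != previous_device:
            total += transfer_cost  # texture data has to cross the bus
        total += compute_cost
        previous_device = device
    return total


TRANSFER = 500.0  # hypothetical cost of one trip over the bus

# Text rendered on the CPU (cost 1), everything else on the GPU.
mixed = [("gpu", 10.0), ("cpu", 1.0), ("gpu", 10.0)]

# Text rendered on the GPU instead, a dozen times slower (cost 12).
all_gpu = [("gpu", 10.0), ("gpu", 12.0), ("gpu", 10.0)]

mixed_cost = frame_cost(mixed, TRANSFER)
all_gpu_cost = frame_cost(all_gpu, TRANSFER)
print(mixed_cost, all_gpu_cost)  # prints "1021.0 32.0"
```

With these assumed numbers the two bus crossings swamp everything else, so the all-GPU path wins despite the slower text stage. The conclusion obviously depends on the transfer cost being large relative to the compute cost, which is the premise of the comment above.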

Luckily AMD and Intel are trying to simplify things quite a bit by putting CPUs and GPUs on the same hunk of silicon. No need to flush textures back and forth if you're sharing the same memory. :) But even then, making proper use of the GPU in software will yield huge improvements in efficiency and performance.

LPC: Life after X

Posted Nov 8, 2010 14:02 UTC (Mon) by nix (subscriber, #2304)

This is just a re-expression of a problem graphics developers have had for fifteen years or more, ever since video cards started getting dedicated RAM with relatively high access latencies from the CPU: you can store stuff in VRAM and it's really fast to manipulate with the GPU and to display, or you can store it in main memory and bash it with the CPU and it's much slower to display, but if you mix the two, you get incredible sloth. Back when the offscreen pixmap cache didn't have a defragmenter (pre X.org 1.6.0) it tended to get too fragmented to put any useful amount of text into... and text scrolling, on a Radeon 9250, then took several *seconds* per screen. That was entirely because of repeated CPU<->VRAM data transfers. They're *slow*.
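The fragmentation problem can be illustrated with a toy allocator. This is a sketch under loud assumptions: `PixmapCache` is a hypothetical first-fit allocator over a fixed-size region and is not the X server's actual offscreen memory manager. The sizes are arbitrary, and a failed allocation stands in for a pixmap that ends up in main memory instead of VRAM.

```python
class PixmapCache:
    """Toy offscreen cache: a fixed-size region handed out first-fit.
    Hypothetical illustration only, not the X server's real allocator."""

    def __init__(self, size):
        self.size = size
        self.blocks = []  # sorted list of (offset, length, name)

    def _gaps(self):
        """Yield (offset, length) for every free gap, in address order."""
        cursor = 0
        for offset, length, _ in self.blocks:
            if offset > cursor:
                yield cursor, offset - cursor
            cursor = offset + length
        if cursor < self.size:
            yield cursor, self.size - cursor

    def alloc(self, name, length):
        """First-fit allocation; returns False when no single gap is big enough."""
        for offset, gap in self._gaps():
            if gap >= length:
                self.blocks.append((offset, length, name))
                self.blocks.sort()
                return True
        return False

    def free(self, name):
        self.blocks = [b for b in self.blocks if b[2] != name]

    def free_total(self):
        return sum(gap for _, gap in self._gaps())

    def defragment(self):
        """Slide every live block down so the free space becomes one gap."""
        cursor = 0
        packed = []
        for _, length, name in self.blocks:
            packed.append((cursor, length, name))
            cursor += length
        self.blocks = packed


cache = PixmapCache(100)
for i in range(10):
    cache.alloc(f"pix{i}", 10)   # fill the cache with ten small pixmaps
for i in range(0, 10, 2):
    cache.free(f"pix{i}")        # free every other one, leaving 10-unit holes

free_before = cache.free_total()         # half the cache is free...
fits_before = cache.alloc("glyphs", 30)  # ...but no single hole is 30 wide
cache.defragment()
fits_after = cache.alloc("glyphs", 30)   # fits once the free space is contiguous
print(free_before, fits_before, fits_after)  # prints "50 False True"
```

In this sketch half of the cache is free, yet the 30-unit request fails until the live blocks are compacted. That is the situation described above: plenty of VRAM nominally available, but nothing large enough to hold the text, so it falls back to the slow path.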

(Of course, KMS should fix all of this by giving us a proper memory manager for VRAM-plus-main-memory. It doesn't seem to be there yet, though: I still see occasional scrolling slowdowns when the pixmap cache gets too fragmented and the defragmenter hasn't kicked in yet.)