Giving Rust a chance for in-kernel codecs

Posted Apr 29, 2024 19:50 UTC (Mon) by stormer (subscriber, #62536)
In reply to: Giving Rust a chance for in-kernel codecs by leromarinvit
Parent article: Giving Rust a chance for in-kernel codecs

I'm quite suspicious about what you mean by "having in kernel codecs". Stateless decoders are hardware that does not track the state of the decoding process. The benefit is that such hardware acts as a special processing core that can be scheduled across different tasks (a task being a stream in this context) in a very flexible way. In contrast, the stateful kind of decoding hardware needs to maintain the state of each concurrent stream, and this is often limited by firmware and co-processor resources. Its scheduling usually cannot be adapted to third-party constraints (consider cgroups and other kinds of quotas).

All in all, the layer placed in front of these drivers through the V4L2 Stateless Decoder interface does not constitute an "in-kernel codec". The hardware implements the heavy processing; userspace implements the high-level decoding logic and parsing. The responsibility of such a Linux kernel driver is to ensure that the parsed parameters and state are valid and match the size of the pre-allocated resources. This isn't something Rust can solve, or ever will: it is pure logic, and logic can be broken even in Rust. For each codec, specific stream parameters passed to the hardware imply specific auxiliary buffer or image sizes. It is the responsibility of the Linux kernel to ensure that the hardware will not overflow these for a given decoder command. As this is a mix of code and hardware, Rust brings nothing here.
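To make the kind of check described above concrete, here is a minimal sketch, written in Rust only for consistency with the rest of the thread; the same logic in C would be just as right or just as wrong. `FrameParams`, `ParamError`, `required_bytes` and `validate` are hypothetical names, not V4L2 structures, and the size formula assumes a plain 4:2:0 layout with no hardware alignment or auxiliary buffers, which real drivers would have to account for.

```rust
/// Hypothetical, simplified per-frame parameters; not a real V4L2 structure.
#[derive(Debug, Clone, Copy)]
pub struct FrameParams {
    pub width: u32,
    pub height: u32,
    pub bit_depth: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParamError {
    Unsupported,
    Overflow,
    BufferTooSmall { needed: u64, available: u64 },
}

/// Bytes needed for one 4:2:0 picture, assuming no alignment padding.
pub fn required_bytes(p: &FrameParams) -> Result<u64, ParamError> {
    if p.width == 0 || p.height == 0 || !(p.bit_depth == 8 || p.bit_depth == 10) {
        return Err(ParamError::Unsupported);
    }
    let bytes_per_sample: u64 = if p.bit_depth > 8 { 2 } else { 1 };
    let w = u64::from(p.width);
    let h = u64::from(p.height);
    let luma = w.checked_mul(h).ok_or(ParamError::Overflow)?;
    // Each chroma plane is subsampled by two in both directions, rounding up.
    let chroma = ((w + 1) / 2)
        .checked_mul((h + 1) / 2)
        .ok_or(ParamError::Overflow)?;
    let samples = chroma
        .checked_mul(2)
        .and_then(|c| c.checked_add(luma))
        .ok_or(ParamError::Overflow)?;
    samples
        .checked_mul(bytes_per_sample)
        .ok_or(ParamError::Overflow)
}

/// Reject a decode request whose parameters imply more data than was allocated.
pub fn validate(p: &FrameParams, available: u64) -> Result<(), ParamError> {
    let needed = required_bytes(p)?;
    if needed > available {
        return Err(ParamError::BufferTooSmall { needed, available });
    }
    Ok(())
}
```

The compiler helps with the arithmetic (the checked operations make overflow explicit), but whether the formula matches what the hardware actually writes is still something only the driver author can get right.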

Though, in order to adapt to all kinds of hardware, we are forced to implement small bits of the codec specs. This is implemented in the form of different codec-specific libraries. For H.264 and HEVC, we transform and reorder reference lists to match each hardware's requirements. For VP9 and AV1, we need to post-process some of the probability tables in order to combine bitstream probability updates with observations made during the decoding process. With just a quick read of these C helpers, you'll notice they are made of tons of C arrays which, if overrun, will silently overwrite each other. I do hope our implementation is right and safe in C, but real confidence could come from the guarantees offered by the Rust language and compiler.
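As a rough illustration of the array-overrun concern, the sketch below uses two adjacent tables and a reference-list builder. All names (`ProbTables`, `set_coef`, `build_ref_list`) and the table sizes are invented for the example and do not mirror the actual kernel helpers; it is also userspace-style Rust (it uses `Vec`), so it only shows the bounds-checking idea, not how kernel code would allocate.

```rust
/// Hypothetical probability tables laid out next to each other,
/// as they would be in a C struct.
pub struct ProbTables {
    pub coef: [u8; 4],
    pub mv: [u8; 4],
}

/// Update one coefficient probability. An out-of-range index is reported
/// to the caller instead of silently spilling into `mv`.
pub fn set_coef(t: &mut ProbTables, idx: usize, value: u8) -> Result<(), usize> {
    match t.coef.get_mut(idx) {
        Some(slot) => {
            *slot = value;
            Ok(())
        }
        None => Err(idx),
    }
}

/// Build a reference list by picking entries of `dpb` in the order given by
/// `indices`. Returns `None` if any index falls outside `dpb`.
pub fn build_ref_list(dpb: &[u32], indices: &[usize]) -> Option<Vec<u32>> {
    indices.iter().map(|&i| dpb.get(i).copied()).collect()
}
```

In C, writing to `coef[4]` here would typically land in `mv[0]` without any diagnostic; in the sketch the caller gets an error back. Plain indexing (`t.coef[idx]`) would also be caught in Rust, but by a panic rather than a recoverable error.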

Another thing Daniel has been studying is the inner part of the stateless hardware programming. This is not very specific at this point, but this kind of hardware has hundreds of variable-sized parameters packed into registers at different bit locations. What the study revealed is that this code often misses some integer sign and size considerations. This may lead to misprogramming of the hardware in corner cases, errors that would generally be prevented by the Rust compiler. I personally think we could do more than just safety, and improve how we program these registers with the Rust language, but at this stage this is purely a matter of choice and preference.
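A minimal sketch of what sign- and size-aware register packing could look like follows. The helpers `pack_unsigned` and `pack_signed` are hypothetical and are not taken from any existing driver or from Daniel's work; the point is only that Rust has no implicit integer conversions, so every narrowing or sign change has to be written out and can be range-checked at that spot.

```rust
/// Place `value` into a `width`-bit field of `reg` starting at bit `shift`.
/// Returns `None` if the field does not fit in 32 bits or `value` is too wide.
pub fn pack_unsigned(reg: u32, value: u32, shift: u32, width: u32) -> Option<u32> {
    if width == 0 || width > 32 || shift > 32 - width {
        return None;
    }
    let mask: u32 = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
    if value & !mask != 0 {
        return None; // value needs more than `width` bits
    }
    Some((reg & !(mask << shift)) | (value << shift))
}

/// Same as `pack_unsigned`, for a two's-complement signed field.
pub fn pack_signed(reg: u32, value: i32, shift: u32, width: u32) -> Option<u32> {
    if width == 0 || width > 32 {
        return None;
    }
    // Representable range of a `width`-bit signed field.
    let min = -(1i64 << (width - 1));
    let max = (1i64 << (width - 1)) - 1;
    if i64::from(value) < min || i64::from(value) > max {
        return None;
    }
    let mask: u32 = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
    // The cast keeps the two's-complement bit pattern; the mask trims the sign extension.
    pack_unsigned(reg, (value as u32) & mask, shift, width)
}
```

For example, `pack_signed(0, -1, 0, 4)` yields the 4-bit pattern `0xF`, while `pack_signed(0, 8, 0, 4)` is rejected because 8 does not fit in a signed 4-bit field. In C, the equivalent shift-and-or would compile and quietly write a truncated value.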

Giving Rust a chance for in-kernel codecs

Posted Apr 30, 2024 8:32 UTC (Tue) by leromarinvit (subscriber, #56850)

Thanks for the detailed explanation! Somehow, I misread the article and was left with the impression that V4L2 contains at least significant parts of full codec implementations, to support the parts that some decoders don't implement in hardware. That's what I commented on, but I see now that what looked like a strange design to me was simply a misunderstanding.

Like I said, so far I've never had the need to handle video in my own code, so I know next to nothing about how all the components work and interact.

Giving Rust a chance for in-kernel codecs

Posted Apr 30, 2024 9:38 UTC (Tue) by farnz (subscriber, #17727)

Note that even without the little quirks, stateless codecs will always need something to manage three chunks of state for them:

1. Position in the bitstream. Something needs to track how far through the bitstream you are, and avoid either skipping bits that the stateless codec needs to see, or sending it the same bits repeatedly when it doesn't need them again.
2. Picture reordering. Video codecs can predict the "current" picture from both past pictures, and future pictures. If you have pictures in output order 1 2 3 4, where pictures 2 and 3 can use picture 4 as a reference, the bitstream will contain pictures in the order 1 4 2 3, and something needs enough state to rearrange that into 1 2 3 4.
3. Reference picture buffering. Video codecs aren't allowed to refer to arbitrary pictures when they predict the current picture; they're only allowed to refer to "reference pictures". Every codec has rules for when a picture enters the set of reference pictures, and when it leaves, and the stateful wrapper has to ensure that the stateless decoder has the "right" set of reference pictures available to it.
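
The picture reordering case can be sketched in a few lines. This is a deliberately simplified model: `ReorderBuffer` is a hypothetical type, and it assumes output positions are consecutive integers known at decode time, whereas real codecs derive output order from values such as picture order counts and have their own rules for when a picture may be released.

```rust
use std::collections::BTreeSet;

/// Holds decoded pictures (identified here only by their output position)
/// until every earlier picture has been emitted.
pub struct ReorderBuffer {
    next: u32,
    pending: BTreeSet<u32>,
}

impl ReorderBuffer {
    /// `first` is the output position of the first picture to display.
    pub fn new(first: u32) -> Self {
        Self { next: first, pending: BTreeSet::new() }
    }

    /// Accept a picture in decode order; return the pictures that are now
    /// ready for display, in output order.
    pub fn push(&mut self, output_index: u32) -> Vec<u32> {
        self.pending.insert(output_index);
        let mut ready = Vec::new();
        // Drain every picture that is next in line for output.
        while self.pending.remove(&self.next) {
            ready.push(self.next);
            self.next += 1;
        }
        ready
    }
}
```

Feeding it the decode order 1 4 2 3 from the example above releases picture 1 immediately, holds picture 4, releases 2 when it arrives, and then releases 3 and 4 together.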

And then you get into more complicated things like stormer described, but also things like MPEG-2 having per-sequence and per-picture state, which needs to be available to the decoder with every slice it's decoding. The wrapper thus has to understand enough of the bitstream to know what state the decoder needs to be given with each slice it's decoding.