The leap second bug
Posted Jul 4, 2012 9:20 UTC (Wed) by farnz
In reply to: The leap second bug
Parent article: The leap second bug
The device in question did conformance checking of encoded video streams. We had four interesting threads: the statistics gathering thread, the automatic reaction thread, the decode thread, and the render thread. There were three input feeds.
The statistics gathering thread gathers interesting information about the stream and stores it for the automatic reaction thread and the render thread to use. This can take up to 4ms per frame per input, depending on the instantaneous complexity of the inputs (it does a partial decode of video and audio to approximate some interesting measures of each), but normally takes about 1ms per input.
The automatic reaction thread applies a set of business rules against the statistics, and can trigger external systems to react to an out-of-bounds condition; it also indicates to the render thread that it should show an alarm state to any human user. This takes no more than 1ms with the most complicated rules permitted.
The render thread takes 1ms to render the statistics and any alarm, and a further 4ms to update the video box.
The decode thread takes up to 4ms to decode each frame of a selected input. As decoded video is only for presentation to a user, it is considered low importance.
When you add it up, the statistics take 12ms (4ms across each of 3 inputs). The reaction thread gets us to 13ms. The render thread needs 1ms if not showing video, for 14ms total, or 5ms if showing video, for 18ms total. The decode thread can add another 4ms to that (22ms total), and our deadline is 16.66ms per frame (60Hz output). In the worst case, we are more than 5ms over budget for a single frame.
We took this to our product managers, and were told that as long as the automatic reactions happen every frame, we would be OK if the UI was late (render and decode threads), but that they'd want to see at least 1 in every 4 frames of video. This was because the automation was expected to run 24/7 as a set-and-forget system, but the UI would be something only used by some customers at critical times, and could be slow to update.
We handled this by making the reaction thread the highest possible priority; the stats thread is the next priority down, as it's more important to have the automated reactions happening than it is to keep the user updated (we expected most people to treat the product as set-and-forget). The decode thread is the next priority, as we want to complete a frame decode once we've started it, so that there is some video to display; we don't want to be unable to ever decode a frame in time to display it. The render thread runs at a low priority.

A mutex then permits the render thread to release the decode thread when the render thread has enough time left until its next deadline that it should be safe to decode a frame. If the render thread doesn't reach this point within 3 frame times, the decode thread starts anyway and claims the CPU (delaying the render thread, as the decode thread is higher priority, and this is all SCHED_FIFO scheduling).
Given the constraints, how would you have implemented it?