Realtime group scheduling doesn't know JACK

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 20:19 UTC (Tue) by cmccabe (guest, #60281)
In reply to: Realtime group scheduling doesn't know JACK by jimparis
Parent article: Realtime group scheduling doesn't know JACK

> Right, so you're saying it's latency. But why is this such a big deal for
> audio? The latency between a user's input and the response comes up
> everywhere in an interactive system (moving the mouse and seeing the arrow
> move, clicking to close a browser tab, pressing spacebar to pause your
> mplayer video, shooting a gun in a video game, etc) and yet you don't hear
> every other developer complaining about latency. Why should the latency
> between audio input -> audio output be any more difficult than the latency
> between USB input -> video output?

That's a very good question. Part of the answer is that ordinary users DO care about latency. Part of what held back garbage-collected languages for so long was users' annoyance with the long and nondeterministic delays. For example, Java on the desktop in the nineties was just no fun at all.

But at the end of the day, for your average Joe, occasional latency spikes are not a big deal. You set your xmms buffer to a big size and get on with your life. If a latency spike happens when you're playing a game or clicking 'save' in your spreadsheet program, that's mildly annoying, but hardly a major problem.

For audio professionals, an occasional latency spike is a real problem. If you're in the middle of composing something on a MIDI keyboard and the computer randomly inserts a long delay, making it sound wrong, you're going to be annoyed. You're going to lose work and time. People can notice audio delays that are more than about 10 ms.

You have to realize that a lot of compositions are done in layers so that you have something playing, and you add another track to it. In those cases, it's critical that the time between pressing the key on the piano keyboard and registering it be kept to a minimum. Buffering will not help you there.

This question is a little bit like asking "I pulled out MS NotePad and started typing some code. It seemed to work fine! Why do programmers need all these fancy editors with macros and whatnot?" Well, they have different needs.


Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:18 UTC (Tue) by jimparis (subscriber, #38647) [Link]

> If you're in the middle of composing something on a MIDI keyboard and the
> computer randomly inserts a long delay, making it sound wrong, you're
> going to be annoyed. You're going to lose work and time. People can
> notice audio delays that are more than about 10 ms.

That's assuming that Linux is in charge of measuring the timing between MIDI keypresses, and therefore a delay in Linux will add a delay in the music. Why is that the case? Is MIDI hardware really that bad? I would hope the MIDI hardware sends data to the computer that looks like:
1. Pressed C at 12:00:00.102
2. Pressed D at 12:00:00.112
3. Pressed A at 12:00:00.122
4. Pressed B at 12:00:00.132

If the computer is busy for 25ms and misses #2 and #3, that's fine, it will read those events out of a buffer when it returns. The data will still show a keypress every 10ms as intended.
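As a rough illustration of that idea (a sketch only, not how any real MIDI interface or driver behaves), suppose each event carries a timestamp assigned by the device. The `(note, device_timestamp_ms)` tuple format and the helper names below are hypothetical. A host that stalls and then drains the buffer in one go still recovers the original spacing:

```python
from collections import deque

# Hypothetical event format: (note, device_timestamp_ms).
# The timestamps are assumed to be assigned by the device, not by the host.
device_buffer = deque([("C", 102), ("D", 112), ("A", 122), ("B", 132)])


def drain(buffer):
    """Read every queued event at once, as a host returning from a stall would."""
    events = []
    while buffer:
        events.append(buffer.popleft())
    return events


def intervals_ms(events):
    """Spacing between consecutive events, computed from device timestamps only."""
    return [later[1] - earlier[1] for earlier, later in zip(events, events[1:])]


events = drain(device_buffer)
print(intervals_ms(events))  # [10, 10, 10]
```

The spacing depends only on the device timestamps, so when the host actually ran does not change the recorded timing.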

If MIDI hardware is exceptionally stupid and requires the computer to perform all timings, then it sounds like there's a market for a very simple hardware enhancement that would avoid this sort of issue.

> You have to realize that a lot of compositions are done in layers so
> that you have something playing, and you add another track to it. In
> those cases, it's critical that the time between pressing the key on the
> piano keyboard and registering it be kept to a minimum. Buffering will
> not help you there.

In this case I'd argue that the output and input streams need only be _synchronized_, not that there be zero latency between them. Assume you have a huge buffer for audio output, and a huge buffer for the MIDI input. The computer should be able to know that "output sample #12345 made it to the speaker at 12:00:00.123" and "keypress C on the piano was registered at 12:00:00.123" based on either measured latencies, reported timestamps from the relevant devices, or sample rate info plus manual adjustment (like Rock Band's calibration screen). Even if the computer is busy and doesn't actually register the keypress until 12:00:00.456, you can still match the keypress with 100% accuracy to the proper place in the music, so that when you go to play it back later you can put it exactly where the musician intended.
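A minimal sketch of that bookkeeping, assuming a 48 kHz sample rate (my assumption; no rate is given here) and one known anchor pairing an output sample with the time it reached the speaker. The function name and the millisecond units are hypothetical, chosen only for illustration:

```python
SAMPLE_RATE_HZ = 48_000  # assumed sample rate, not taken from the discussion


def sample_index_at(event_ms, anchor_sample, anchor_ms, rate_hz=SAMPLE_RATE_HZ):
    """Map an event timestamp to the output sample that was audible at that time."""
    return anchor_sample + round((event_ms - anchor_ms) * rate_hz / 1000)


# Anchor from the example: sample #12345 reached the speaker at .123 s
anchor_sample, anchor_ms = 12345, 123

keypress_ms = 123    # when the key was actually pressed
registered_ms = 456  # when the busy host finally read it; unused for placement

print(sample_index_at(keypress_ms, anchor_sample, anchor_ms))       # 12345
print(sample_index_at(keypress_ms + 10, anchor_sample, anchor_ms))  # 12825
```

Note that `registered_ms` never enters the calculation: placement depends only on the event's own timestamp. This covers after-the-fact alignment only; it does nothing for what the performer hears while playing.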

It really seems to me like the approach of audio guys is to try to get latency to zero, rather than accepting variable latency and simply accounting for it with larger buffers and proper timestamping. Zero latency seems like a losing battle because you'll never reach it, so for use cases like you described where you can correct for the effects of latency, that seems like a better solution.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:35 UTC (Tue) by jebba (✭ supporter ✭, #4439) [Link]

> It really seems to me like the approach of audio guys is to try to get latency to zero, rather than accepting variable latency and simply accounting for it with larger buffers and proper timestamping

Uh, no. In fact, you can't do jackd with 0 latency and that isn't the goal. RTFS. Your scenario is FAIL and it is clear you have never used it in production. Why not actually listen to the people that actually have done this?

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:36 UTC (Tue) by sfeam (subscriber, #2841) [Link]

You are clearly speaking from the perspective of someone who is not a musician. If you're the one playing the keyboard, you really do notice and suffer from a lag in response on the order of 10-20 msec. Buffering in the computer doesn't help; you'd need the buffering to happen in your brain. Yes, you can train yourself to deal with it up to a certain point. It's like playing as part of a live ensemble split between two sides of a large stage. You are focusing on the path after the signal enters the computer, forgetting that the human brain is on both the input and output side. The musician needs what the ear hears to be in sync with what the hands play.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:44 UTC (Tue) by jimparis (subscriber, #38647) [Link]

You're still missing the point that the latency only matters if the computer is both taking input (e.g. midi keyboard) and also creating the output (e.g. synthesized audio). Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back. They can be perfectly synchronized in software to match what you heard while singing.

I think my only real error was underestimating how frequently people need to either apply live digital effects or otherwise play live synthesized audio from live external triggers, which sounds like "always" from what people are saying here.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:50 UTC (Tue) by sfeam (subscriber, #2841) [Link]

"Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back."

Next time you go to a performance with amplification, pay attention to what happens during the sound check. Pay particular attention to the small monitors near the front of the stage that point back at the performers so that they can hear the mix of what they are singing into the mic with the rest of the input. Chances are that you'll hear the performers request more or less feedback from the monitor, because getting it right makes it a whole lot easier to perform. You can't just sing, or play, into a vacuum and somehow know that it's coming out right. The feedback of your own mechanical actions, voice or fingers, to your ears is crucial.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 23:21 UTC (Tue) by chad.netzer (✭ supporter ✭, #4257) [Link]

"Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back."

Nope. Singers like to hear some of their voice fed back to them. It's not uncommon for unconstrained musicians to move to the corner of a room, to best hear the reflection of their voice and instrument (at low latency). If they are using headphones, the audio equipment has to provide that feedback loop. It is not an "offline" operation, where latencies could be ignored (or corrected).

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 0:35 UTC (Wed) by baldridgeec (guest, #55283) [Link]

Composing, even electronica, is also "live" unless you're doing it in a MOD editor from the 90s.

When I'm playing my electronic drum kit using Hydrogen to provide the samples, BELIEVE ME, there is a very very noticeable difference between hearing your snare strike 10 ms after your stick hits the pad and 50 ms after your stick hits the pad. 50 ms is still playable if it's consistent, and if you practice at it a bit - some people use Windows, after all, so it obviously works, even if it's not ideal. :)

But 100 ms is not usable at all. A tenth-second gap means there is no relation whatsoever between the part of the phrase in your head (that is currently being conveyed through your hands to the equipment) and the part of the phrase coming into your ears and being interpreted by your brain as "what I'm playing right now." You can't force half of your brain to work a 1/16 note in the past at the same time as you play what you need to for the present. It just won't work.
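For a sense of scale, here is the arithmetic behind equating 100 ms with a 1/16 note. The tempo is an assumption for illustration (none is named above); at 150 quarter-note beats per minute the two line up exactly:

```python
def sixteenth_note_ms(bpm):
    """Duration of a sixteenth note in milliseconds, with the quarter note as the beat."""
    quarter_ms = 60_000 / bpm  # one beat, in milliseconds
    return quarter_ms / 4      # four sixteenths per quarter note


print(sixteenth_note_ms(150))  # 100.0
print(sixteenth_note_ms(120))  # 125.0
```

At slower tempos a sixteenth note lasts longer than 100 ms, so the same delay is a smaller fraction of the beat, though still well above the 10 ms figure mentioned earlier.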

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 1:51 UTC (Wed) by nix (subscriber, #2304) [Link]

Actually you *can* train yourself to do it: I can think of several works in which you have to (the Phase works by Steve Reich, for example). But they're rare, and it's difficult, and you really wouldn't want to do it for everything.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 2:04 UTC (Wed) by baldridgeec (guest, #55283) [Link]

(Just looked them up, never heard of them before, interesting! I'll have to find some recordings)

From what it says on Wikipedia, it's played as a duet - i.e. the music you play is still in time with itself; your part is in phase with the other part.

Still hell to play, yeah, but it would be pretty much impossible if the sound from your own piano were to come at you with an audible delay. It would be easier to play if you were deaf - at least then nothing would interfere with the rhythm in your head.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 12:00 UTC (Wed) by nix (subscriber, #2304) [Link]

Actually one of the ways to play it *is* with headphones that pick up what you're hearing and rebroadcast it to you delayed by just enough (a changing delay): then all you have to do is keep what you hear from becoming phased! (Another common way, and probably the more effective one, is to play the second part without a tape of the first part at all, just with a metronome beat defining when the first part's beats are, then mix the two together later. But you can't do that live and it feels obscurely like cheating. Live performers generally have to do it without any artificial assistance at all, and that *is* hard.)

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 0:39 UTC (Wed) by jebba (✭ supporter ✭, #4439) [Link]

> Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back.

Not if you are sending that voice back to them so they can hear it in the mix, which of course they have to. Send it to the singer with high latencies, and they look at you like "WTF??!" because it throws them off rhythm, which is kind of important.

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 11:05 UTC (Thu) by jwakely (subscriber, #60262) [Link]

> You're still missing the point that the latency only matters if the computer is both taking input (e.g. midi keyboard) and also creating the output (e.g. synthesized audio).

The computer could be taking multiple inputs, both midi and audio, and having to send audio output to an effects unit which then comes *back* as input again, and sending midi output to other sound modules which might be sending their audio output directly to a speaker, not back into the computer where it could be buffered to re-sync.

It's really not as simple as just lining up a few different audio sources.

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 9:05 UTC (Thu) by nicooo (guest, #69134) [Link]

> But at the end of the day, for your average Joe, occasional latency spikes are not a big deal.

Not a big deal because they don't know that latency spikes shouldn't happen in the first place. Like Windows users who think it's normal to reboot the computer every time they install/update a program.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds