
Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 22:29 UTC (Mon) by dlang (guest, #313)
In reply to: Realtime group scheduling doesn't know JACK by jimparis
Parent article: Realtime group scheduling doesn't know JACK

buffers work if you don't mind a delay between the application and the real-world output.

however if you are trying to do things in real-time, you can't afford to have large buffers as you need the real-world output to happen as close to the input as possible.

so to minimize the buffers, applications need to be sure of having CPU time when it's needed.



Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 22:43 UTC (Mon) by jimparis (guest, #38647) [Link] (39 responses)

> however if you are trying to do things in real-time, you can't afford to have large buffers as you need the real-world output to happen as close to the input as possible.

Right, so you're saying it's latency. But why is this such a big deal for audio? The latency between a user's input and the response comes up everywhere in an interactive system (moving the mouse and seeing the arrow move, clicking to close a browser tab, pressing spacebar to pause your mplayer video, shooting a gun in a video game, etc) and yet you don't hear every other developer complaining about latency. Why is the latency between audio input -> audio output any more difficult than the latency between USB input -> video output?

Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 22:50 UTC (Mon) by dlang (guest, #313) [Link] (7 responses)

because the output then gets mixed with the input in the human ear, and the human ear can hear very short delays

when you are in a game and there is a slight delay between hitting the trigger and hearing the sound, it's not really that big a deal, but if the sound from the speakers is delayed from the sound directly from the audio source, it is very jarring.

Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 23:21 UTC (Mon) by jimparis (guest, #38647) [Link] (6 responses)

> because the output then gets mixed with the input in the human ear, and the human ear can hear very short delays

The eye can see very short delays too. Certainly moving the mouse or typing on your screen would be a lot less enjoyable if you had 200ms lag in there. And lag alone can't be that big of a problem with audio -- just standing 20 feet away from the speaker will add 20ms on its own.

> when you are in a game and there is a slight delay between hitting the trigger and hearing the sound, it's not really that big a deal, but if the sound from the speakers is delayed from the sound directly from the audio source, it is very jarring.

OK, so it sounds like that concern isn't directly lag, but rather having two sources of audio that are out of sync. And this can often be accounted for by just delaying the faster source slightly to match the delay of the slower (laggier) one. That's how e.g. video games like Rock Band deal with the fact that modern TVs can have quite large lag -- they let you delay the audio so that the audio and video reach the user at the same time.

Can you provide a more specific example setup where you'd see the problem you describe?

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 0:05 UTC (Tue) by dlang (guest, #313) [Link]

200ms of lag is noticeable, but 90ms of lag when typing is probably not.

however, 90ms of lag in mixing live audio is very noticeable.

the difference is what you are comparing it to.

your eyes can detect latency, yes, but they detect it by comparing different visual cues; finger (keyboard) to screen is not all visual.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 5:26 UTC (Tue) by jebba (guest, #4439) [Link] (2 responses)

> Can you provide a more specific example setup where you'd see the problem you describe?

I'm not sure this answers your question exactly, but I used jackd/ardour with live musicians and they could definitely notice if the system was set to higher latencies when you have a mix of live players and earlier recorded tracks. It was easy enough to get the latencies low enough, but it was interesting to me to see how clearly they noticed the difference.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 17:42 UTC (Tue) by drag (guest, #31333) [Link] (1 responses)

To get an idea of the scale we are talking about...

The speed of sound in a low humidity environment is going to be something like 350 to 400 meters per second. I've heard that humans can't distinguish 10msec from instantaneous. That would be somebody standing about 4 meters away from you making a sound.

Probably 100msec is acceptable, I figure, from end to end. It'll sound like you're standing a ways away in terms of response, but you can compensate.

But if you tune your system to deliver 30msec performance and Linux randomly has latency spikes past 150msec every time a disk is accessed, then it's worthless to you.

Realtime is about being able to provide deterministic latency. Linux can't deliver reliable performance without realtime configurations: you can't really tell what latencies you can rely on, because the system is not configured to behave in that way. It will be optimized for batch processing and server workloads, and in that mode you can get latencies all over the place as long as that delivers the best performance in the long run.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 16:24 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

>The speed of sound in a low humidity environment is going to be something like 350 to 400 meters per second. I've heard that humans can't distinguish 10msec from instantaneous. That would be somebody standing about 4 meters away from you making a sound.

The human ear can distinguish a difference of about 500 microseconds, i.e. a difference in distance of about 20 centimeters. Humans use this for directional hearing.

That's why the smallest possible latencies are so important.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 19:07 UTC (Wed) by jrigg (guest, #30848) [Link]

>Can you provide a more specific example setup where you'd see the problem you describe?

Recording musicians. If there's no way of monitoring what is being played direct from hardware (which is sometimes the case) there needs to be minimal latency between a musician playing something and hearing it back from the recording system, otherwise the musicians can't play in time with each other. If you're paying top money for studio time and session musicians, what might be near enough for a "Guitar Hero" type game isn't good enough. A 20ms delay is enough to cause a problem (musicians don't tend to stand 20 feet from each other in a studio).

There's also the situation where you're recording or mixing a lot of tracks, possibly with lots of DSP going on, and you need certain things to happen within a short deadline to avoid audible dropouts. Large buffers help, but if your system decides to put audio on hold while it does something else it's pretty inconvenient (and in professional audio, inconvenience = cost).

Realtime group scheduling doesn't know JACK

Posted Jan 6, 2011 13:18 UTC (Thu) by farnz (subscriber, #17727) [Link]

Practical, real-world example. You are playing electric guitar, accompanying a friend who's singing; you've plugged the guitar and the microphone into the computer, which is recording the raw signals (for later editing), plus acting as a guitar amp, effects box and audio mixer to give you immediate audio feedback in your headphones (basically a first cut of how the final recorded track will sound).

You can't delay the (quiet) sound from the guitar strings near you; nor can you delay the loud sound from the friend who's singing. All you can do is minimise the latency in your computer, so that what you hear in your headphones (with effects applied and mixed in) is as close in time to what you're playing to give you a decent experience.

Remember, too, that you may well be experimenting ("jamming") - if the singer changes key, the guitarist may want to follow suit (and vice-versa); as that sort of behaviour is unpredictable, you can't try and play ahead to compensate for the latency.

Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 23:12 UTC (Mon) by strcmp (subscriber, #46006) [Link] (15 responses)

In this context real-time means e.g. applying sound effects to music while it is played on an instrument in a band. You don't want to add a delay effect to everything; the sound has to mix correctly with the other instruments and what you hear directly from the same instrument. Everything above 30ms or so for the complete signal path is noticeable. As I understand it, latency is the main drawback of digital effects compared to analog effects.

Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 23:29 UTC (Mon) by jimparis (guest, #38647) [Link] (14 responses)

Having to "mix correctly with the other instruments and what you hear directly from the same instrument" seems like one of the (few) cases you can't account for with just intentional delays and buffering, I agree with that.

But most of what I've seen on the topic seems to be complaining about other problems. The JACK FAQ says "JACK requires real-time scheduling privileges for reliable, dropout-free operation". I guess maybe that's just because they set their buffers so small to try to reduce latency. For users who aren't interested in the use case you described, it would be much easier to just increase the buffers and allow 50ms latency rather than have to go through all this trouble to try to get the kernel to do realtime.
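
To put rough numbers on that tradeoff: at a fixed sample rate, the period size and period count set the minimum input-to-output latency, so "allow 50ms latency" translates directly into a buffer size. A minimal sketch of the arithmetic (the specific numbers are illustrative, not taken from this thread):

    /* latency implied by a period/buffer setting, ignoring converter and
     * hardware FIFO delays */
    #include <stdio.h>

    int main(void)
    {
        const unsigned rate = 48000;             /* sample rate in Hz (assumed) */
        const unsigned frames_per_period = 1200; /* illustrative period size */
        const unsigned periods = 2;              /* typical double buffering */

        double latency_ms = 1000.0 * frames_per_period * periods / rate;
        printf("%u frames x %u periods @ %u Hz -> %.1f ms\n",
               frames_per_period, periods, rate, latency_ms);
        /* 1200 x 2 @ 48000 Hz -> 50.0 ms; 128 x 2 -> ~5.3 ms */
        return 0;
    }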

Realtime group scheduling doesn't know JACK

Posted Dec 20, 2010 23:54 UTC (Mon) by strcmp (subscriber, #46006) [Link] (13 responses)

If you couple multiple digital effects in a network, it gets too complicated and creates too much overall latency to add delays. Imagine input going both directly and via effect A to effect B, and from there going to output both directly and via C and D in series. You have to add delays parallel to A and C/D, _and_ because A, B, C and D are high latency to begin with, they add up to an even higher delay. Imagine pressing a key just in time and hearing the sound half a beat later; if your music has 180 bpm, every beat is 330ms and you can have 2 or 4 notes per beat. This should be limited to pneumatic pipe organs, where it's hard enough to play with other people.

Of course JACK is just meant for music production; you don't need it for everyday sound.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 6:52 UTC (Tue) by drag (guest, #31333) [Link] (12 responses)

"Realtime" does NOT mean low latency. It _CAN_ mean low latency, if that is what the user wants and the hardware is capable of delivering. What it really means is that latency is deterministic.

For audio applications what matters much more is that everything is predictable and you keep end-to-end latencies controllable. Imagine a concert pianist with a piano that, when they press the keys, produces sounds after a random delay anywhere from instantaneous to a quarter of a second. Some sounds may even come out of order. This is fine with TCP/IP, but for Linux-based digital audio workstations with MIDI controllers and such, that sort of performance would be unacceptable.

Under common configurations, Linux will let latencies vary by huge amounts. It is simply not capable of processing audio in a reliable enough manner to be suitable for audio production. You will see latencies randomly within a range of 10msec to 500 or 1000 msecs depending on system load (150-250msecs is probably very common). And system load can change quite a bit... like double-clicking on a file to open a new MP3 sample, or the apt daemon kicking off and checking for security updates, or the user alt-tabbing between applications... Any disk activity is usually pretty damning.

What makes it worse for Linux (from a development perspective) is that high-performance workloads often require a system configuration with extremely lousy realtime performance. Think of server-style/database/batch workloads. That way you take advantage of cache and all that happy stuff to make your processes go fast: nice long, continuous reads from disk into memory, nice long-lasting level-3 caches to make your loops go fast, etc etc.

With a realtime-style workload you are going to be more than willing to flush those buffers and take page hits and all that stuff all over the place in exchange for the ability to say "X needs to get done every 10msec no matter what". So when Linux is configured for good realtime performance you will often see a dramatic loss in system efficiency if there is a realtime process actually running.

For audio work it's really important that you keep feeding the sound card and other devices audio data at a predictable rate. The hardware is effectively a "First In First Out" buffer that is always going to produce output regardless of whether or not you provide it information. Even if your system does not process the audio quickly enough, that is not going to stop your hardware from playing or recording audio information. As a result a system with poor realtime performance will create a large number of audio artifacts when loaded. You'll get echoes, pops, squeaks, pauses, out-of-sync audio, out-of-sync video, stuttering video, scratches, etc etc and all other types of really bad things in your recordings and performances.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 18:12 UTC (Tue) by jimparis (guest, #38647) [Link] (11 responses)

> For audio work it's really important that you keep feeding the sound card and other devices audio data at a predictable rate. The hardware is effectively a "First In First Out" buffer that is always going to produce output regardless of whether or not you provide it information. Even if your system does not process the audio quickly enough, that is not going to stop your hardware from playing or recording audio information. As a result a system with poor realtime performance will create a large number of audio artifacts when loaded. You'll get echoes, pops, squeaks, pauses, out-of-sync audio, out-of-sync video, stuttering video, scratches, etc etc and all other types of really bad things in your recordings and performances.

To solve the problem of buffer underruns, you need to increase the size of your buffers. If you cannot increase the size of your buffers due to latency concerns, or if you have hardware limitations, _then_ I agree you need realtime. But it seems to me that the vast majority of "I'm getting xruns with jack, I need realtime!!" type postings that I see can be solved better by increasing software buffer size, because they are use cases that don't also require low latency (e.g. playing one track while recording another, or playing and live mixing multiple tracks at once, etc). Maybe my perception is just wrong, and low latency is really a requirement more often than I think. Or maybe JACK does need to better handle the case where latency is OK and expected, and account for it rather than trying to push it under the carpet.

In some of my research work, we deal with streaming data from various capture cards at relatively high rate (up to 2Mbit/s). We write it directly to disk on a relatively busy system that can go out to lunch for 500ms or more. The only time I ever had to invoke SCHED_FIFO was with some poorly-designed USB hardware that only had a 384 byte buffer. Now we design our hardware with at least 1MB buffer and the kernel scheduler is no longer a concern.
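
For readers who haven't used it, "invoking SCHED_FIFO" amounts to a single system call; the hard part is the privilege to make it (the /etc/security/limits.conf rtprio discussion elsewhere in this thread). A minimal sketch, with an arbitrary priority value chosen purely for illustration:

    #include <errno.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        struct sched_param sp;
        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = 10;  /* 1..99; illustrative, not a recommendation */

        /* pid 0 = the calling process; needs CAP_SYS_NICE or an rtprio limit */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
            fprintf(stderr, "SCHED_FIFO refused: %s\n", strerror(errno));
            return 1;
        }
        puts("now running under SCHED_FIFO");
        return 0;
    }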

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 18:20 UTC (Tue) by dlang (guest, #313) [Link]

part of the fun is that it's hard to know ahead of time if you can afford large buffers or not.

when you are building and configuring a system, how do you know if the output is going to a recorder (where latency is acceptable) or to speakers (where it's not)?

since there are some cases where you need the real-time response, why not use that setting all the time, rather than having one set of settings when you are going from live to live and a different set when you are doing something where latency is acceptable?

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 18:43 UTC (Tue) by drag (guest, #31333) [Link] (6 responses)

> To solve the problem of buffer underruns, you need to increase the size of your buffers. If you cannot increase the size of your buffers due to latency concerns, or if you have hardware limitations, _then_ I agree you need realtime. But it seems to me that the vast majority of "I'm getting xruns with jack, I need realtime!!"

No.

On Linux without a realtime configuration you can't make your buffers large enough to avoid xruns, unless they are very large, because you're not able to control your latencies.

That's the point. Realtime gives you control over the latencies. Of course Linux can't give you hard realtime, but certainly the level that Linux can provide when properly configured is enough for audio production.

> In some of my research work, we deal with streaming data from various capture cards at relatively high rate (up to 2Mbit/s). We write it directly to disk on a relatively busy system that can go out to lunch for 500ms or more. The only time I ever had to invoke SCHED_FIFO was with some poorly-designed USB hardware that only had a 384 byte buffer. Now we design our hardware with at least 1MB buffer and the kernel scheduler is no longer a concern.

Does your data corrupt and get nasty artifacts when you get latencies up to 500msecs? Because that is what happens with audio streams. You're not just dealing with recording data to disk, you're interacting with the real world, which won't be able to wait on your computer to respond...

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 19:46 UTC (Tue) by jimparis (guest, #38647) [Link] (5 responses)

> Does your data corrupt and get nasty artifacts when you get latencies up to 500msecs? Because that is what happens with audio streams. You're not just dealing with recording data to disk, you're interacting with the real world, which won't be able to wait on your computer to respond...

No, it doesn't corrupt. And it's real world data coming in. And I also didn't have any corruption or artifacts when I had 500msec latency when I was running a MythTV backend recording two streams of live television simultaneously. And I can play DVD video just fine even though my old DVD-ROM drive has seek times on the order of 250ms. Etc, etc.

That's my whole point -- there is a whole world of other systems that (to me) seem much more demanding than audio processing, like video input and output, and gaming, that don't seem to have the realtime requirements or problems of JACK. I'm trying to understand exactly what the audio use cases are that make processing 48 KHz data so difficult. And it seems like the answer is the specific case of live looping back audio from input->output which needs low latency.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:25 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

Not just audio input, any case where input passes through the JACK system to become output. That's what latency is! For example, from thumping the concert A on a MIDI keyboard to hearing the sound from the software instrument. The larger the buffer, the longer it takes.

Or consider DJing. Suppose we just have a 2 second buffer. So I hear the end of track A, there's a boring instrumental lead out, I want to play track B - must I wait 2 seconds before it begins? That's a long time. OK, so maybe we provide the DJ a separate mix. Now I can hear where track B will start playing, but it takes 2 seconds before I hear my own decisions. This is very difficult to work with.

You can play with this, if you have lab audio equipment, create a 2 second delay audio loop, where what you say is recorded and then played back quietly through headphones (so it doesn't feed back). You won't be able to hold a conversation properly, because you must concentrate very hard just to override a built-in audio-cognitive mechanism that tells you you're being interrupted. Reduce the loop to 15ms. Now it's fine. The much shorter delay resembles environmental echo, which was present when our audio cognition evolved. Latency matters.
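
The per-cycle deadline this implies shows up directly in the shape of a JACK client: the server calls the process callback once per period, and everything must be finished before the next call or you get an xrun. A minimal sketch (the client and port names are made up, and a real client would render audio rather than silence):

    #include <jack/jack.h>
    #include <stdio.h>
    #include <unistd.h>

    static jack_port_t *out_port;

    /* Called by the JACK server once per period.  With a 128-frame period at
     * 48 kHz this function has roughly 2.7 ms to produce its data, every time;
     * miss the deadline and the result is an audible dropout. */
    static int process(jack_nframes_t nframes, void *arg)
    {
        jack_default_audio_sample_t *out = jack_port_get_buffer(out_port, nframes);
        for (jack_nframes_t i = 0; i < nframes; i++)
            out[i] = 0.0f;          /* silence; a real client does its DSP here */
        return 0;
    }

    int main(void)
    {
        jack_client_t *client = jack_client_open("latency-sketch", JackNullOption, NULL);
        if (!client)
            return 1;

        out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        jack_set_process_callback(client, process, NULL);
        jack_activate(client);

        printf("period %u frames @ %u Hz\n",
               jack_get_buffer_size(client), jack_get_sample_rate(client));
        sleep(30);                  /* let it run for a while */
        jack_client_close(client);
        return 0;
    }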

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 12:25 UTC (Wed) by drag (guest, #31333) [Link]

> That's my whole point -- there is a whole world of other systems that (to me) seem much more demanding than audio processing, like video input and output, and gaming, that don't seem to have the realtime requirements or problems of JACK. I'm trying to understand exactly what the audio use cases are that make processing 48 KHz data so difficult. And it seems like the answer is the specific case of live looping back audio from input->output which needs low latency.

It's not just _LOW_ latency.

Realtime configurations give you the ability to _CONTROL_ latency.

That's the point.

Often stuff running realtime will have significantly worse performance than it would in the normal configuration.

Linux is designed to provide high throughput and work efficiently under high loads. The design choices and compromises that go into making Linux behave well and perform well under high loads are COUNTERPRODUCTIVE to providing good realtime performance.

It's not a question of how demanding the workload is; it has to do with your ability to control how often something happens and when it happens.

THAT is what realtime performance is about. Not making your system work faster, quicker, or perform better... it's just a question of being able to control latencies. Without realtime configurations you can see latency spikes up to 250msec or more on a regular basis on Linux. It's just the nature of the beast.

And of course even with Linux in 'realtime mode' it's not perfect and can't guarantee you 100% latency control... Linux is far too complicated for that, but it does help tremendously.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 20:09 UTC (Wed) by jrigg (guest, #30848) [Link] (1 responses)

>I'm trying to understand exactly what the audio use cases are that make processing 48 KHz data so difficult.

If you're processing a lot of tracks simultaneously the throughput can be quite high. In film soundtrack work it's not uncommon to be working with a hundred or more tracks at once.

Realtime group scheduling doesn't know JACK

Posted Jan 6, 2011 19:38 UTC (Thu) by xorbe (guest, #3165) [Link]

Dude. Let's run the numbers on something crazy, like 8-channel audio with an insane bit rate and a huge number of streams.

192000 samples/second * 8 movie channels * 4 bytes/sample * 128 streams / 1048576

That's a whole 750MB/s ... my desktop computer has 30GB/s of DRAM bandwidth. For the far more common 48K * 2 ch * 2 b/sample * 32 streams, the bandwidth is a pathetically small 6MB/s.

The fact that the Linux kernel can't natively handle mixing a few audio streams for the average desktop user without trotting out the anti-DoS argument is silly.
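
The same arithmetic with the units written out (samples/s x channels x bytes/sample x streams, divided by 2^20 for MB/s), as a quick check:

    #include <stdio.h>

    int main(void)
    {
        /* "crazy" case: 192 kHz, 8 channels, 32-bit samples, 128 streams */
        double extreme = 192000.0 * 8 * 4 * 128 / (1024.0 * 1024.0);
        /* common case: 48 kHz, stereo, 16-bit samples, 32 streams */
        double common  = 48000.0 * 2 * 2 * 32 / (1024.0 * 1024.0);

        printf("extreme case: %.0f MB/s\n", extreme);  /* 750 MB/s */
        printf("common case:  %.1f MB/s\n", common);   /* ~5.9 MB/s */
        return 0;
    }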

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 6:37 UTC (Thu) by omez (guest, #6904) [Link]

Depending on your game, you can easily replicate the problems JACK is suffering. A player can learn to see past consistent quarter-second network latencies in a fast paced game. If the apparent latency changes because the kernel wants to contemplate its navel for 100ms, however, all bets are off. You will be a sitting duck, fragbait.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:09 UTC (Tue) by tialaramex (subscriber, #21167) [Link] (1 responses)

Yes, your perception is wrong. To build professional audio systems you need to care about latency nearly all the time. If your point was "JACK is not necessarily the right choice for playing a 'Ding!' sound when I receive new mail", you have our agreement. But that's not what JACK users are using it for.

It all matters: when I hit a button I want it to take effect immediately. I understand as a programmer that it can't happen literally the moment I press the button, but as the milliseconds drag on the delay becomes perceptible, and then quickly annoying. On a typical PC JACK can get that delay down to the almost imperceptible where a typical non-realtime audio framework struggles to keep it sub-second.

Professional musicians can cope with and compensate for latency up to a certain point. Playing on a huge stage, or with some types of instrument that inherently have latency, you learn. But we're talking low tens of milliseconds at most.

JACK isn't alone here; this is a long-studied problem. You need buffers, and we have buffers, up to a few thousand frames (samples x channels), which equates to 100+ milliseconds - but beyond that you're just not professional audio any more. And Linux can easily stall non-realtime processes for that long.

If latency didn't matter, there'd be no difference between a realtime NLE and the old batch-processed, make-a-decision-and-render systems. But in reality the former are tremendously more productive AND require a lot less practice.

[ JACK has a mode for people who don't care about time, freewheel mode, in this mode audio clock time passes as quickly as your CPU can process samples, and of course you don't use RT priority because the machine would appear to freeze indefinitely. ]

Realtime group scheduling doesn't know JACK

Posted Jan 4, 2011 17:37 UTC (Tue) by nye (subscriber, #51576) [Link]

>On a typical PC JACK can get that delay down to the almost imperceptible where a typical non-realtime audio framework struggles to keep it sub-second.

Ouch. My only experience with lowish-latency audio on Linux was back in 2005 when I was trying to make WoW in Wine as responsive as possible. At the time, the buffer could be squeezed down to about 90ms, but any lower would risk very occasional underruns when under heavy load (always caused by disk activity, not CPU) - becoming quite frequent underruns by the time you got to a target latency of 50ish. Using rt could push it a couple of tens of ms lower but wasn't really necessary for gaming and came at a cost in throughput.

If we now need rt scheduling to avoid being several times worse than we were on cheap hardware years ago, then something is surely wrong. I wonder if this difference is due to software, or the continued decline in quality of audio hardware (eg. all mixing is done in software nowadays because the hardware can't do it).

Realtime group scheduling doesn't know JACK

Posted Jan 11, 2011 23:53 UTC (Tue) by davepl (guest, #72324) [Link]

I must step in here. Firstly, I look after multimedia apps and libs for openSUSE, and the largest and most prominent packaging "todo" I have is a jack that just works when installed. I'm also a hardware person majoring in games, controllers and juke boxes.
Jack needs RT for latency in response time. Jack is basically a bunch of virtual wires which connect the virtual desks, effects, etc. that make up a very nice and usable array of professional audio applications that run on linux. We all know what happens if we use a defective lead to connect our dvd machine to the fancy home theater system we just bought; the same thing happens to jack if it's looking after a couple of synths, a few filters and an external midi keyboard. My pet package rosegarden won't even start jack if it (jack) doesn't have realtime rights with the kernel, and as a result rosegarden doesn't produce sound.
On the other side of the card, one doesn't mix professional audio processing with servers and the like. My celeron E1200 with 1 gig of ram and only one ide port, with my dvd writer and only system hard drive hanging on it (I'm saving up for a sata drive to clear a major system bottleneck), is certainly not suitable as a professional anything workstation, but it produces clean audio as long as nothing else is hogging I/O, even with a crappy onboard HD sound chip which uses up system resources (I've at least got a motherboard with potential; the E1200 is the slowest cpu it will work with). Audio is slow in frequency and doesn't have high bandwidth requirements like, for example, gaming 3D, which is taken care of by the GFX card. A professional sound card has the audio processing and audio buffering (ram) to take the load off the system, the same as a high-end gaming graphics card does, and I must add that the providers of this type of hardware are generally happy to stay in the unfree but profitable world of microsoft - have you ever heard of high-end audio or video hardware without a shiny windows installation dvd? What the blinded-by-windows masses don't realize is that linux mostly has these drivers already built in to the kernel, and rosegarden for one has a directory full of hardware midi drivers; if you don't find yours, the developers will try their damndest to make one for you.

The only blot on linux's copybook is the connecting cables - Jack having to disconnect and reconnect (please don't take this as an intellectual statement, it's simply an attempt to simplify my explanation) to accommodate daemons with higher privileges that it isn't fast enough to keep up with. On a multiplexed system, RT is a major privilege, one which gives the application the power to screw up the kernel's scheduling of other software. Linux, being traditionally a server workhorse and relatively new to the role of PC, is naturally cautious about allowing things that can cause denial of service problems. I innocently asked about jack's rpm writing to "/etc/security/limits.conf" on installation and was met with shock from the old-school packagers - well, just one actually, who stated "I'd veto such abuse of the audio group" and would be likely to do so if I attempted to submit such a jack to factory. It was also hard to explain that jack was started by various user programs and not by the user itself. We must also take into account the fact that jack can use the network as well, which makes it more of a security threat (don't tell, but I sneaked in a jack2 replacement for jack without remembering the aforementioned). At the end of the day, though, jack isn't going to be used on an apache production server but on a box that has that side of linux "dumbed down".

Jack needs RT to enable it to respond in real time to simultaneous requests from the likes of rosegarden and hydrogen; otherwise the drummer will be out of sync with the rest of the band, or worse still jack will become overwhelmed by the quantity of tasks it's waiting to perform, buckle under a denial of service attack from the kernel it trusts, and just freeze up. This really happens. And the users that have had linux fervently pushed upon them will shuffle back to windows and Cubase, be it pirated or not - the poor have no conscience about cash before copyright - whereas if jack ran in real time, linux would have converted criminals into honest people.
With my hardware hat on, most audio latency problems like "echoes, pops, squeaks, pauses" can be taken care of by heaps of ram, very efficient virtual memory, or a couple of raid stripes, but what cannot be taken care of by buffering is jack's role of conductor in the orchestra of impatient audio apps.

It's refreshing and reassuring to read the comments of the libcgroup developer; there is a light at the end of the tunnel and I must research this further. Maybe we can have a "working out of the box" jack in openSUSE 11.4.
I hope my story has enlightened the doubters as to why jack needs realtime. Why? For the furthering of the open source cause, that's why.
I'm speculating that libcgroup has fresh linux blood in it that cares about linux the PC as much as linux the server.

I can assure the doubters that jack needs RT privileges due to its role as a server - the word server as in X server, not as in samba server - which connects a network of applications to create audio compilations; the word "network" not to be confused with the network which is translating my key presses to this web page, but more like a network of businesses that complement and help each other. Jack doesn't need RT like a custom-built car needs custom parts and paint work, but like a tractor needs special wheels to pull the plough through the field.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 20:19 UTC (Tue) by cmccabe (guest, #60281) [Link] (13 responses)

> Right, so you're saying it's latency. But why is this such a big deal for
> audio? The latency between a user's input and the response comes up
> everywhere in an interactive system (moving the mouse and seeing the arrow
> move, clicking to close a browser tab, pressing spacebar to pause your
> mplayer video, shooting a gun in a video game, etc) and yet you don't hear
> every other developer complaining about latency. Why is the latency
> between audio input -> audio output any more difficult than the latency
> between USB input -> video output?

That's a very good question. Part of the answer is that ordinary users DO care about latency. Part of what held back garbage-collected languages for so long was users' annoyance with the long and nondeterministic delays. For example, Java on the desktop in the nineties was just no fun at all.

But at the end of the day, for your average Joe, occasional latency spikes are not a big deal. You set your xmms buffer to a big size and get on with your life. If a latency spike happens when you're playing a game or clicking 'save' in your spreadsheet program, that's mildly annoying, but hardly a major problem.

For audio professionals, an occasional latency spike is a real problem. If you're in the middle of composing something on a MIDI keyboard and the computer randomly inserts a long delay, making it sound wrong, you're going to be annoyed. You're going to lose work and time. People can notice audio delays that are more than about 10 ms.

You have to realize that a lot of compositions are done in layers so that you have something playing, and you add another track to it. In those cases, it's critical that the time between pressing the key on the piano keyboard and registering it be kept to a minimum. Buffering will not help you there.

This question is a little bit like asking "I pulled out MS NotePad and started typing some code. It seemed to work fine! Why do programmers need all these fancy editors with macros and whatnot?" Well, they have different needs.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:18 UTC (Tue) by jimparis (guest, #38647) [Link] (11 responses)

> If you're in the middle of composing something on a MIDI keyboard and the
> computer randomly inserts a long delay, making it sound wrong, you're
> going to be annoyed. You're going to lose work and time. People can
> notice audio delays that are more than about 10 ms.

That's assuming that Linux is in charge of measuring the timing between MIDI keypresses, and therefore a delay in Linux will add a delay in the music. Why is that the case? Is MIDI hardware really that bad? I would hope the MIDI hardware sends data to the computer that looks like:
1. Pressed C at 12:00:00.102
2. Pressed D at 12:00:00.112
3. Pressed A at 12:00:00.122
4. Pressed B at 12:00:00.132

If the computer is busy for 25ms and misses #2 and #3, that's fine, it will read those events out of a buffer when it returns. The data will still show a keypress every 10ms as intended.

If MIDI hardware is exceptionally stupid and requires the computer to perform all timings, then it sounds like there's a market for a very simple hardware enhancement that would avoid this sort of issue.

> You have to realize that a lot of compositions are done in layers so
> that you have something playing, and you add another track to it. In
> those cases, it's critical that the time between pressing the key on the
> piano keyboard and registering it be kept to a minimum. Buffering will
> not help you there.

In this case I'd argue that the output and input streams need only to be _synchronized_, not that there is zero latency between them. Assume you have a huge buffer for audio output, and a huge buffer for the MIDI input. The computer should be able to know that "output sample #12345 made it to the speaker at 12:00:00.123" and "keypress C on the piano was registered at 12:00:00.123" based on either measured latencies, reported timestamps from the relevant devices, or sample rate info plus manual adjustment (like Rock Band's calibration screen). Even if the computer is busy and doesn't actually register the keypress until 12:00:00.456, you can still with 100% accuracy match the keypress to the proper place in the music, so that when you go to play it back later you can put it exactly where the musician intended.

It really seems to me like the approach of audio guys is to try to get latency to zero, rather than accepting variable latency and simply accounting for it with larger buffers and proper timestamping. Zero latency seems like a losing battle because you'll never reach it, so for use cases like you described where you can correct for the effects of latency, that seems like a better solution.
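
A hypothetical sketch of the bookkeeping proposed above - an illustration of the argument, not how JACK or ALSA actually handle MIDI: if the audio stream and the incoming events are stamped against a common clock, a late-read event can still be placed on the exact sample where the musician played it.

    #include <stdint.h>
    #include <stdio.h>

    struct midi_event {
        double wallclock;  /* seconds; stamped by hardware/driver at the key press */
        int    note;
    };

    /* Map an event's timestamp onto a sample index in the recorded stream,
     * given the wall-clock time at which sample 0 reached the speaker. */
    static int64_t event_to_sample(const struct midi_event *ev,
                                   double stream_start, double sample_rate)
    {
        return (int64_t)((ev->wallclock - stream_start) * sample_rate + 0.5);
    }

    int main(void)
    {
        struct midi_event ev = { 12.345, 60 /* middle C */ };
        /* Even if the host reads this event 500 ms late, it still lands at
         * (12.345 - 10.0) * 48000 = sample 112560. */
        printf("note %d belongs at sample %lld\n", ev.note,
               (long long)event_to_sample(&ev, 10.0, 48000.0));
        return 0;
    }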

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:35 UTC (Tue) by jebba (guest, #4439) [Link]

> It really seems to me like the approach of audio guys is to try to get latency to zero, rather than accepting variable latency and simply accounting for it with larger buffers and proper timestamping

Uh, no. In fact, you can't do jackd with 0 latency and that isn't the goal. RTFS. Your scenario is FAIL and it is clear you have never used it in production. Why not actually listen to the people that actually have done this?

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:36 UTC (Tue) by sfeam (subscriber, #2841) [Link] (9 responses)

You are clearly speaking from the perspective of someone who is not a musician. If you're the one playing the keyboard, you really do notice and suffer from a lag in response on the order of 10-20 msec. Buffering in the computer doesn't help; you'd need the buffering to happen in your brain. Yes, you can train yourself to deal with it up to a certain point. It's like playing as part of a live ensemble split between two sides of a large stage. You are focusing on the path after the signal enters the computer, forgetting that the human brain is on both the input and output side. The musician needs what the ear hears to be in sync with what the hands play.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:44 UTC (Tue) by jimparis (guest, #38647) [Link] (8 responses)

You're still missing the point that the latency only matters if the computer is both taking input (e.g. midi keyboard) and also creating the output (e.g. synthesized audio). Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back. They can be perfectly synchronized in software to match what you heard while singing.

I think my only real error was underestimating how frequently people need to either apply live digital effects or otherwise play live synthesized audio from live external triggers, which sounds like "always" from what people are saying here.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 21:50 UTC (Tue) by sfeam (subscriber, #2841) [Link]

"Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back."

Next time you go to a performance with amplification, pay attention to what happens during the sound check. Pay particular attention to the small monitors near the front of the stage that point back at the performers so that they can hear the mix of what they are singing into the mic with the rest of the input. Chances are that you'll hear the performers request more or less feedback from the monitor, because getting it right makes it a whole lot easier to perform. You can't just sing, or play, into a vacuum and somehow know that it's coming out right. The feedback of your own mechanical actions, voice or fingers, to your ears is crucial.

Realtime group scheduling doesn't know JACK

Posted Dec 21, 2010 23:21 UTC (Tue) by chad.netzer (subscriber, #4257) [Link]

"Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back."

Nope. Singers like to hear some of their voice fed back to them. It's not uncommon for unconstrained musicians to move to the corner of a room, to best hear the reflection of their voice and instrument (at low latency). If they are using headphones, the audio equipment has to provide that feedback loop. It is not an "offline" operation, where latencies could be ignored (or corrected).

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 0:35 UTC (Wed) by baldridgeec (guest, #55283) [Link] (3 responses)

Composing, even electronica, is also "live" unless you're doing it in a MOD editor from the 90s.

When I'm playing my electronic drum kit using Hydrogen to provide the samples, BELIEVE ME, there is a very very noticeable difference between hearing your snare strike 10 ms after your stick hits the pad and 50 ms after your stick hits the pad. 50 ms is still playable if it's consistent, and if you practice at it a bit - some people use Windows, after all, so it obviously works, even if it's not ideal. :)

But 100 ms is not usable at all. A tenth-second gap means there is no relation whatsoever between the part of the phrase in your head (that is currently being conveyed through your hands to the equipment) and the part of the phrase coming into your ears and being interpreted by your brain as "what I'm playing right now." You can't force half of your brain to work 1/16 note in the past at the same time as you play what you need to for the present. It just won't work.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 1:51 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

Actually you *can* train yourself to do it: I can think of several works in which you have to (the Phase works by Steve Reich, for example). But they're rare, and it's difficult, and you really wouldn't want to do it for everything.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 2:04 UTC (Wed) by baldridgeec (guest, #55283) [Link] (1 responses)

(Just looked them up, never heard of them before, interesting! I'll have to find some recordings)

From what it says on Wikipedia, it's played as a duet - i.e. the music you play is still in time with itself; your part is in phase with the other part.

Still hell to play, yeah, but it would be pretty much impossible if the sound from your own piano were to come at you with an audible delay. It would be easier to play if you were deaf - at least then nothing would interfere with the rhythm in your head.

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 12:00 UTC (Wed) by nix (subscriber, #2304) [Link]

Actually one of the ways to play it *is* with headphones that pick up what you're hearing and rebroadcast it to you delayed by just enough (a changing delay): then all you have to do is keep what you hear from becoming phased! (Another common way, and probably the more effective one, is to play the second part without a tape of the first part at all, just with a metronome beat defining when the first part's beats are, then mix the two together later. But you can't do that live and it feels obscurely like cheating. Live performers generally have to do it without any artificial assistance at all, and that *is* hard.)

Realtime group scheduling doesn't know JACK

Posted Dec 22, 2010 0:39 UTC (Wed) by jebba (guest, #4439) [Link]

> Surely you would agree that latency doesn't matter at all if you are e.g. recording singing from a microphone to accompany a guitar track you are playing back.

Not if you are sending that voice back to them so they can hear it in the mix, which of course they have to. Send it to the singer with high latencies, and they look at you like "WTF??!" because it throws them off their rhythm, which is kind of important.

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 11:05 UTC (Thu) by jwakely (subscriber, #60262) [Link]

> You're still missing the point that the latency only matters if the computer is both taking input (e.g. midi keyboard) and also creating the output (e.g. synthesized audio).

The computer could be taking multiple inputs, both midi and audio, and having to send audio output to an effects unit which then comes *back* as input again, and sending midi output to other sound modules which might be sending their audio output directly to a speaker, not back into the computer where it could be buffered to re-sync.

It's really not as simple as just lining up a few different audio sources.

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 9:05 UTC (Thu) by nicooo (guest, #69134) [Link]

> But at the end of the day, for your average Joe, occasional latency spikes are not a big deal.

Not a big deal because they don't know that latency spikes shouldn't happen in the first place. Like Windows users who think it's normal to reboot the computer every time they install/update a program.

Realtime group scheduling doesn't know JACK

Posted Dec 23, 2010 15:38 UTC (Thu) by tseaver (guest, #1544) [Link]

> Why is the latency between audio input -> audio output any more
> difficult than the latency between USB input -> video output?

When doing multitrack recording, the performer is listening to the audio
output while singing / playing / speaking in time with it. Any human-
perceptible latency in the loop interferes with her ability to perform
well: lags are *incredibly* painful in this situation.

JACK-with-RT makes it possible to drive the latency down to a few
milliseconds (two on my laptop, using a decent USB audio interface),
making it again pleasant (even possible) to do multi-track work.

