Alsa is an end-user nightmare

Posted Dec 17, 2006 11:46 UTC (Sun) by nim-nim (subscriber, #34454)
Parent article: Linux desktop architects map out plans for 2007 (Linux.com)

Alsa is very powerful, and ISVs would have adopted it long ago if end-users had pushed for it (all the technical grumbles notwithstanding). However, Alsa has so far failed to attract user backing.

Alsa exposes all the raw inputs, outputs and knobs of a chip, but it does not provide any user-friendly tool to manage them. The best it has done is define some default profiles that the user can select at setup, which are neither tweakable nor exposed in the apps.

Where is the user-friendly GUI app that allows managing Alsa configuration?
1. that displays:
   - input-output routing
   - what part of the pipeline each knob controls
   - where software processing blocks (format conversion, effects, etc.) can be inserted
2. that allows selecting the controls that will be displayed in app UIs, knowing what their effect will be
3. that allows testing the setup by sending sounds to inputs and hearing where they emerge
4. that saves profiles under user-friendly names (music setup, gaming setup, dvd-viewing setup) and lets users switch easily between them in sound apps
5. that allows defining the events that will trigger switching to a profile automatically (for example, an app tries to play 5.1 sound, so give it the 5.1 routing profile)
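Points 4 and 5 could be modelled roughly as follows. This is only a sketch of the idea, not anything alsa provides: the `Profile` and `ProfileManager` classes, the control names and the event string "app-requests-5.1" are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A named set of mixer settings plus a routing (all hypothetical)."""
    name: str
    routing: str                                   # e.g. "stereo" or "5.1"
    controls: dict = field(default_factory=dict)   # control name -> level (0-100)


class ProfileManager:
    """Stores named profiles and switches between them manually or on events."""

    def __init__(self):
        self.profiles = {}   # profile name -> Profile
        self.triggers = {}   # event name -> profile name
        self.active = None

    def save(self, profile):
        self.profiles[profile.name] = profile

    def on_event(self, event, profile_name):
        """Register a rule: when `event` occurs, switch to `profile_name`."""
        if profile_name not in self.profiles:
            raise KeyError(f"unknown profile: {profile_name}")
        self.triggers[event] = profile_name

    def switch(self, profile_name):
        self.active = self.profiles[profile_name]
        return self.active

    def handle(self, event):
        """Apply the matching rule if there is one; otherwise keep the current profile."""
        name = self.triggers.get(event)
        if name is not None:
            self.switch(name)
        return self.active


manager = ProfileManager()
manager.save(Profile("music setup", "stereo", {"Master": 80}))
manager.save(Profile("dvd-viewing setup", "5.1", {"Master": 70, "Surround": 65}))
manager.switch("music setup")
manager.on_event("app-requests-5.1", "dvd-viewing setup")
```

Calling `manager.handle("app-requests-5.1")` would then make the 5.1 profile active without the user touching a mixer, while an event with no rule leaves the current profile alone.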

Instead, all this info stays in the alsa driver writer's head, and alsa dumps the control list on the end user, hoping he can do something with it. In practice you'd better hope one of the default profiles works for you (and is honoured by all your apps), because that's as far as you're likely to get.

I suppose that in the pro-audio world people can afford to define static audio pipelines manually, and don't need to switch between listening to CDs, viewing DVDs, playing games and audio-conferencing, but basic end-users need more hand-holding. Even trivial stuff like mapping mono sound to both stereo outputs does not happen in the alsa world today. Users have no access to the power that might make them like alsa; in fact, alsa makes it so hard to do anything that often they can't manage the basic stuff either.
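To show how small the mono-to-stereo operation is in principle, here is a minimal sketch that copies each mono sample into both the left and right slots of an interleaved stereo buffer. It assumes signed 16-bit samples and does not touch any alsa API; `mono_to_stereo` is a hypothetical helper name.

```python
import array


def mono_to_stereo(mono):
    """Duplicate each 16-bit mono sample into interleaved left/right samples."""
    stereo = array.array("h")
    for sample in mono:
        stereo.append(sample)  # left channel
        stereo.append(sample)  # right channel
    return stereo


mono = array.array("h", [0, 1000, -1000])
stereo = mono_to_stereo(mono)
```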

Instead, the alsa docs focus on helping users break the distro kernel by replacing the perfectly sane (if misconfigured) drivers shipped by the distro with ones that won't work with the system's alsa lib version. I suppose that when you have no tool to help users configure their setup, switching to the latest version and hoping its defaults work better is all you can do. But it's not helping alsa go anywhere.

Alsa is an end-user nightmare

Posted Dec 19, 2006 0:47 UTC (Tue) by jd (guest, #26381)

Ideally, you'd have four distinct APIs, rather than one ugly do-everything-badly API. The only "real" API would be the raw API; the rest would merely be wrappers that made it possible for humans on the planet Earth to actually write code and have people use it without their brains exploding. I'm thinking something along the lines of:

Basic API that provides access to (essentially) universally-present and (essentially) universally-used features. This would exclude all the fancy stuff, all the card-specific stuff, all the app-specific stuff. This is all you want for the system chimes, basic mixers, the bulk of DVD/CD players and the bulk of games. Lightweight, easy, maintainable, documented.
Extended API that provides access to stuff that's not in your basic realm, but the bulk of users WILL want these features enough of the time that you can't fob it off.
Advanced API that provides an abstract API to every other function that can sanely be abstracted. Anything that simply can't be abstracted or anything that is so unique that the abstraction is going to be identical to the specific case would not be in here. This is where you'd put all the REALLY fancy controls for professional sound apps, sound processing, etc.
Raw API that provides a pseudo-abstraction to access all of the functions on all of the cards. Absolutely everything can be done through a call to this and ultimately ALL sound operations will be done through a call to this. There is no finer-grain than a fully-exposed API. Great for development, great for totally specialist applications and embedded systems, great for raw speed, great for causing programmer brains to explode.
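The layering above might look something like this in code. It is a sketch under assumptions: the class names, the method names and the "Master Left"/"Master Right" control names are invented, and stacking each layer on the one below it (rather than wrapping the raw layer directly) is just one way to arrange the wrappers.

```python
class RawAPI:
    """Stand-in for the fully exposed layer: every control on every card."""

    def __init__(self):
        self.calls = []  # log of (card, control, value), kept for illustration

    def write_control(self, card, control, value):
        self.calls.append((card, control, value))


class AdvancedAPI:
    """Abstracts whatever can sanely be abstracted; passes through to raw."""

    def __init__(self, raw):
        self.raw = raw

    def set_control(self, card, control, value):
        self.raw.write_control(card, control, value)


class ExtendedAPI:
    """Features most users want some of the time, such as per-side levels."""

    def __init__(self, advanced):
        self.advanced = advanced

    def set_balance(self, card, left, right):
        self.advanced.set_control(card, "Master Left", left)
        self.advanced.set_control(card, "Master Right", right)


class BasicAPI:
    """Universally present features only: one volume knob on a default card."""

    def __init__(self, extended, default_card=0):
        self.extended = extended
        self.default_card = default_card

    def set_volume(self, percent):
        level = max(0, min(100, percent))  # clamp to a sane range
        self.extended.set_balance(self.default_card, level, level)


raw = RawAPI()
basic = BasicAPI(ExtendedAPI(AdvancedAPI(raw)))
basic.set_volume(150)  # out-of-range request is clamped before it reaches raw
```

Every call ends up as a raw call, as described, but an app that only needs system chimes never has to look past `BasicAPI`.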

Alsa is an end-user nightmare

Posted Dec 19, 2006 13:32 UTC (Tue) by nim-nim (subscriber, #34454)

Grr, wanted to reply here and replied there (http://lwn.net/Articles/214858/) instead

Alsa is an end-user nightmare

Posted Dec 19, 2006 17:57 UTC (Tue) by jd (guest, #26381)

Unlike some of the managers I have the unfortunate luck to be saddled with, I have this nasty tendency to listen to people who point out flaws in my ideas. :)

I appreciate your comments and, yes, you are right that users will want full access to the cards. (And sometimes that will be plural. Less so these days, but it wasn't that long ago that if you didn't have both an LAPC-I and a Sound Blaster, you didn't have a sound system worth a damn - and programs were written to expect you to have both.)

The tendency has been to push the intelligence into the libraries and drivers, so that app writers describe what they want and the environment they run over does all of the logic to decide how best to achieve the results. This makes app writing relatively easy, but has the penalty that you're limited to what can be achieved (in some form or other) across the board. Multiple APIs that you can blend as needed are an improvement, in that applications can choose to take back control on an as-needed basis, but they don't solve the problem that applications are still going to be limited to lowest common denominators for a lot of operations.

The more access you want, the higher up you need the intelligence. This is true for any hardware - the less you can assume, the more you have to decide. There are still levels of encapsulation you can do - SCSI vs. IDE, Ethernet vs. ATM, or whatever. This allows you to stack stuff together, so that you only need to bubble up what absolutely can't be done at a lower level, and it only needs to bubble up so far and no further.

The other thing to consider is that sound cards are generally presented as cards: one physical entity with a fixed number of physical channels associated with it. Almost no other device on the system is exposed in this manner. They are all exposed per virtual channel, where one virtual channel may encompass any designated fraction F of N physical channels across M physical devices. Applications just see the virtual channel; they need neither know nor care about the values of F, N and M.
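A rough sketch of what such a virtual channel could look like from the application side follows. The `PhysicalChannel` and `VirtualChannel` names are hypothetical, and the model is deliberately simplified: it fans the same samples out to whole physical channels on several devices and does not attempt the fractional (F) sharing described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalChannel:
    """One hardware channel on one device (names are illustrative only)."""
    device: str
    index: int


class VirtualChannel:
    """What the application sees; the backing hardware stays hidden."""

    def __init__(self, name, backing):
        self.name = name
        self._backing = list(backing)  # physical channels, possibly on several devices

    def write(self, samples):
        """Fan the same samples out to every backing physical channel."""
        return {channel: list(samples) for channel in self._backing}


front = VirtualChannel(
    "front",
    [PhysicalChannel("card0", 0), PhysicalChannel("card1", 2)],
)
delivered = front.write([1, 2, 3])
```

The application only ever calls `front.write(...)`; which cards and channels receive the data is decided by whoever configured the virtual channel.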

Because sound is treated at a purely physical level, there is much less experience in virtualizing sound than for other systems. Maybe this is something that needs to be addressed.