
Linux desktop architects map out plans for 2007 (Linux.com)

Joe 'Zonker' Brockmeier covers the Desktop Architecture Meeting in Portland, Ore. "One of the priorities coming out of DAM3 is fixing the sound experience on the Linux desktop. Sound is a mess on the Linux desktop, and developers are finally starting to turn their attention to clearing up the muddle. McQuillan says, "By far, the most important thing coming out of DAM3 was the understanding that we need the audio/multimedia system to finally settle down and adopt a single robust API.""


Not my desktop.

Posted Dec 15, 2006 22:17 UTC (Fri) by jd (guest, #26381) [Link] (11 responses)

Does it bother anyone else that a conference on an open standard for an open operating system, held at an open source laboratory, where said open standard is being developed solely because closed, proprietary technologies and NDAs are savagely limiting progress, is not being reported on because of NDAs and closed, proprietary technologies?

Dunno about anyone else, but whilst I agree that sound is a mess - the sound drivers work on fewer than 50% of the kernels I've installed on my machine, and ARTS does some strange stuff at times - I'll take the instability over behind-closed-doors cloak-and-dagger architectures any day. I can fix a kernel without much trouble, I can't fix an architectural error with nearly the same ease.

The other thing is that we're not going to get a single, robust API, whether we want it or not. Nobody is going to modify the sound system on closed-source proprietary Unixes simply to mollify Linux developers. That is just not going to happen. Which means that it will be one Linux API, plus one API for every other OS that software needs to support. Let's say there's a hundred Unix derivatives out there - going from 103 APIs to 100 APIs is an improvement, but hardly a revolutionary one and certainly not one that is going to mean a damn thing to the Linux desktop.

Not my desktop.

Posted Dec 15, 2006 23:02 UTC (Fri) by sjj (guest, #2020) [Link] (8 responses)

Who cares about sound systems on proprietary Unix platforms any more? Which proprietary Unix is used as a desktop workstation today? Solaris maybe, but seeing that they are pretty eager to market their Linux-compatibility nowadays, why wouldn't they use it?

Besides, OSS is already available as the commercial cross-unix sound system.

What I want on Linux is to clean up and document ALSA, integrate it with JACK, and make multiple programs using sound at the same time work by default. Then have one higher-level API. And really, start with documenting ALSA (hint: don't start with instructions on how to compile your kernel; it's 2006, after all).

And apps like Amarok that force you to figure out the ALSA name of a sound card to switch output ("hw:1,0"), type it in, then restart the app to make it stick should be sent to Gitmo. Likewise apps that think they know what you want and don't even give you an option to change which output to use (*cough* Rhythmbox): what, nobody wants to switch between speakers and headphones, or between a crappy headset and good music-listening headphones?
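For what it's worth, you can at least hide the raw device name behind a friendly alias in ~/.asoundrc. A sketch (assuming the second card, hw:1, is the one you want; the alias name is made up):

    pcm.usb_headset {
        type hw
        card 1
        device 0
    }

Then typing "usb_headset" into the app does the same thing as "hw:1,0". It doesn't fix the underlying UI problem, but it makes the incantation less cryptic.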

Not my desktop.

Posted Dec 16, 2006 0:12 UTC (Sat) by AJWM (guest, #15888) [Link] (2 responses)

> Which proprietary Unix is used as a desktop workstation today?

Arguably, Mac OS X. (Sure, it's based on BSD, but it isn't BSD.)

Not to take away from your main point, though.

Not my desktop.

Posted Dec 16, 2006 1:11 UTC (Sat) by sjj (guest, #2020) [Link] (1 responses)

Yes, but Apple will most likely do their own thing. Not that I actually know what their sound system is based on.

I do have some fresh scars from wrestling with sound on Linux, as I'm trying to revive music making as a hobby...

But I'd also like to thank everybody who works on this stuff. It really is getting better all the time. Thanks!

Not my desktop.

Posted Dec 16, 2006 11:21 UTC (Sat) by drag (guest, #31333) [Link]

How sound works in Windows is that you have the standard Win32 API with the KMixer software mixer. The drivers take care of setting up the hardware devices and mapping them to the standard (and rather limited) mixer controls and options that are available to Windows users.

This is nice because it abstracts away the differences in sound cards, so that users are presented with something completely standard and familiar.

This is bad because it makes the default Windows audio system worthless for audio production. People selling audio cards for professional use and people selling audio production software had to come up with their own sound API for Windows to bypass all the KMixer stuff, and that turned into the ASIO drivers from Steinberg.

----

So the awkward thing about ALSA and such is that it's designed to allow low-level direct access to the sound hardware. It has the advantage of the asoundrc and plugin architecture to make it more flexible, but for the end user this is just additional complexity which is not well documented and even harder to deal with.

So you get all this confusion over doing basic mixer controls and setting up just regular desktop recording, because it all depends on which sound card you're using... Each sound card demands a unique setup.

This makes it very difficult to develop documentation and directions and default GUI controls.

I think that if the desktop sound architects want a consistent experience, then they are going to have to agree on a sound server.

I vote for PulseAudio.

I like jackd, but it's designed specifically for audio production, and it should concentrate on that. What you're going to end up with if you standardize around JACK is a situation similar to ALSA's: it's too geared towards sound people, and normal people are going to have a tough time dealing with it. Plus normal applications and games don't interface with it; it's something that programs have to be specifically adapted to use.

PulseAudio, on the other hand, is designed specifically for desktop use. It has a number of attractive features:
- Alsa modules for automatic support from KDE and Gnome alsa applications.
- OSS modules for legacy applications.
- Esound module for legacy gnome stuff.
- Gstreamer plugins.

This means that you need no awkward wrapper scripts for the majority of Linux audio applications, like you had to use with artsd (you still need one for OSS compatibility). There's no need to reprogram libraries or applications to use it, and you gain the following nice features...

- Network transparency for X terminals.
- Zeroconf plugin for advertising a network sound server. This allows easy and automatic setup; most Linux desktops are beginning to integrate Zeroconf, and we have the nice Avahi available for doing this.

- JACK module for tying PulseAudio into a jackd daemon. This can give audio professionals flexible choices in playback applications and the ability to route sound from normal applications and games into their recordings.

See for more details:
http://pulseaudio.org/wiki/PerfectSetup

BTW, PulseAudio is not new software. Many people may be familiar with its previous incarnation, Polypaudio, which itself was designed to solve the limitations of esound.

Of course, the nice thing about it from my perspective is that it's already available on my Debian setup.

rhythmbox

Posted Dec 16, 2006 10:03 UTC (Sat) by johill (subscriber, #25196) [Link]

just uses the default gnome setting... go figure...

Not my desktop. -Rhythmbox

Posted Dec 16, 2006 10:58 UTC (Sat) by Uraeus (guest, #33755) [Link]

Actually you can change the sound output of RB using the gstreamer-properties application. I don't know how your system works, but on mine, if I plug in headphones the music switches to them automatically. Also, in the latest GNOME releases the GNOME settings are HAL-connected, so if a new sound device is connected you should get a dialog popping up asking if you want to switch to it. A lot of work is also currently being done to improve the automated handling of Bluetooth devices.

Not my desktop.

Posted Dec 17, 2006 6:08 UTC (Sun) by Junior_Samples (guest, #26737) [Link] (2 responses)

ALSA is a disgrace. What passes for ALSA documentation is an out-of-date, incomplete, wrong, unfinished mish-mash of broken links to ancient "How-To" guides and dubious FAQs. That an undocumented sound system so complex, unwieldy, and opaque could cling to life this long is a miracle indeed.

Not my desktop.

Posted Dec 17, 2006 19:53 UTC (Sun) by bronson (subscriber, #4806) [Link] (1 responses)

Amen. Alsa's underlying philosophy seems to be, "Expose every feature the driver writer feels like playing with via the mixer interface."

I guess the theory was to push the hardware complexity into userspace where, in theory, it's easier to manage. The driver can then support a rich set of features, present the mixer as its API, and userspace programs can sanitize things for the end user. This, of course, is just nuts. It means that application writers either have to present every feature to the user (the creeping horror that is alsamixer today) or write custom, per-card code to sanitize the interface (er, isn't it the driver's job to abstract the hardware?).

My sound card has 34 mixer levels, of which I really understand only 4. 23 of them don't appear to have any effect on sound output. How do I turn on IEC958? Well, there are two sliders that control it, but damned if I know how to set them properly. Why isn't this a SIMPLE CHECKBOX?!

Arg. So much wasted time. Yes, I too have ALSA scars.

Not my desktop.

Posted Dec 18, 2006 16:56 UTC (Mon) by drag (guest, #31333) [Link]

Well, I would prefer to have ALSA expose the functionality to user space so that at least it's available.

For the majority of users this doesn't matter so much, but occasionally one would want access to some of the more advanced features of sound cards for various reasons.

For example, in Windows you have the Win32 interface and Windows drivers, which abstract away everything and use KMixer to do software mixing. This is good for normal desktop use, but it is not good for any sort of professional audio use of the computer. It was bad enough that ISVs had to write their own driver model (ASIO) to expose functionality to userland so their applications could take advantage of it.

I don't see any reason why all this policy stuff should be shoved into the kernel...

What ALSA does to help make things easier for userland is use its plugin architecture and the asoundrc configuration file.

One approach that could be used is a sane asound.conf made for each sound card, which would provide standard audio I/O and mixer interfaces for desktop use.

This would require distros and end users to test out their sound cards and produce asoundrc files for each card, which could then be incorporated back into the ALSA distribution and provided by default to end users.

Now, asoundrc files are very powerful. You can set up ctl.!default, for example, so that when you open alsamixer you don't see the hardware-based interfaces but a software one.
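As a sketch of what the plugin layer can do (the control name here is made up), a softvol plugin wraps the hardware PCM and adds a purely software volume control that shows up in alsamixer:

    pcm.!default {
        type softvol
        slave.pcm "hw:0"
        control.name "Software Master"
        control.card 0
    }

Note that the softvol control only appears in the mixer after the PCM has been opened at least once.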

Another approach would be to use a sound service, like PulseAudio, to provide an abstracted interface to the sound card.

A possible solution would be to have ALSA create a way to expose hints about driver/sound-card functionality through sysfs/HAL/D-Bus; PulseAudio would take those and provide standardized capabilities higher up the software stack.

PulseAudio already has HAL support: it automatically detects and sets up sound cards for its use (sound format, frequencies, and stuff like that).

This adds software complexity and some latency, but you greatly simplify the user and application interface to the sound system, and you get features like network transparency, so it ties nicely into the X architecture.

For instance, there is a PulseAudio module that checks $DISPLAY and then sends the sound to the PulseAudio daemon on that machine.

To get ALSA-aware applications to use pulse, it's easy: set up a ~/.asoundrc file (or /etc/asound.conf) with this in it:

    pcm.!default {
        type pulse
    }

    ctl.!default {
        type pulse
    }

Then your mixer settings and your sound output go through PulseAudio.
See http://pulseaudio.org/wiki/PerfectSetup

(Unfortunately, while I can get everything working fairly easily through PulseAudio locally, especially because of ALSA's plugin stuff, I can't get SDL games and VLC to work reliably over a network.)

The major problem we have right now is that we have all these different sound APIs:
OSS, ALSA, libao, SDL, GStreamer, aRts, esound, to name a few. There are others besides those.

Even if ALSA worked 100% of the time for everybody, when you go and try to use an OSS application like Skype (or SDL compiled for OSS support), it all turns to shit. Likewise if arts is running without being configured to use ALSA, and lots of other little things.

Not my desktop.

Posted Dec 17, 2006 11:04 UTC (Sun) by nim-nim (subscriber, #34454) [Link] (1 responses)

IMHO that's just because some of the ISVs sent techies without a marketoïd shepherd, and their bosses were afraid the press would immolate them if allowed access.

Help! The diaereses are catching!

Posted Dec 18, 2006 1:07 UTC (Mon) by xoddam (subscriber, #2322) [Link]

> marketoïd

That one isn't even valid! I've never heard a terminal "oid" pronounced
with more than one syllable.

Alsa is an end-user nightmare

Posted Dec 17, 2006 11:46 UTC (Sun) by nim-nim (subscriber, #34454) [Link] (3 responses)

ALSA is very powerful, and ISVs would have adopted it long ago if end users had pushed for it (all the technical grumbles notwithstanding). However, ALSA has so far failed to attract user backing.

ALSA exposes all the raw inputs, outputs and knobs of a chip. But it does not provide any user-friendly tool to manage them. The best it's done is defining some default profiles the user can select on setup, which are neither tweakable nor exposed in the apps.

Where is the user-friendly GUI app that allows managing Alsa configuration?
1. that displays:
— input-output routing
— what part of the pipeline each knob controls
— where software processing blocks (format conversion, effects, etc) can be inserted
2. that allows selecting controls that will be displayed in app UIs knowing what their effect will be
3. that allows testing the setup by sending sounds to inputs and hearing where they emerge
4. that saves profiles under user-friendly names (music setup, gaming setup, dvd-viewing setup) and lets users switch easily between them in sound apps
5. that allows defining the events that will trigger switching to a profile automatically (for example: an app tries to play 5.1 sound, give it the 5.1 routing profile)

Instead, all this info stays in the ALSA driver writer's head, and ALSA dumps the control list on the end user hoping he can do something with it. In practice you'd better hope one of the default profiles works for you (and is honoured by all your apps), because that's as far as you're likely to get.

I suppose in the pro-audio world people can afford to define static audio pipelines manually, and don't need to switch between listening to CDs, watching DVDs, playing games and audio-conferencing, but basic end users need more hand-holding. Even trivial stuff like mapping mono sound to both stereo outputs does not happen in the ALSA world today. Users have no access to the power that might make them like ALSA; in fact, ALSA makes it so hard to do anything that they often can't manage basic stuff either.

Instead, the ALSA docs focus on helping users break the distro kernel by replacing the perfectly sane (if misconfigured) drivers the distro ships with ones that won't work with the system's alsa-lib version. I suppose when you have no tool to help users configure their setup, switching to the latest version and hoping its defaults work better is all you can do. But it's not helping ALSA go anywhere.

Alsa is an end-user nightmare

Posted Dec 19, 2006 0:47 UTC (Tue) by jd (guest, #26381) [Link] (2 responses)

Ideally, you'd have four distinct APIs, rather than one ugly do-everything-badly API. The only "real" API would be the raw API; the rest would merely be wrappers that made it possible for humans on the planet Earth to actually write code and have people use it without their brains exploding. I'm thinking something along the lines of:

  • Basic API that provides access to (essentially) universally-present and (essentially) universally-used features. This would exclude all the fancy stuff, all the card-specific stuff, all the app-specific stuff. This is all you want for the system chimes, basic mixers, the bulk of DVD/CD players and the bulk of games. Lightweight, easy, maintainable, documented.
  • Extended API that provides access to stuff that's not in the basic realm, but that the bulk of users WILL want enough of the time that you can't fob them off.
  • Advanced API that provides an abstract API to every other function that can sanely be abstracted. Anything that simply can't be abstracted or anything that is so unique that the abstraction is going to be identical to the specific case would not be in here. This is where you'd put all the REALLY fancy controls for professional sound apps, sound processing, etc.
  • Raw API that provides a pseudo-abstraction to access all of the functions on all of the cards. Absolutely everything can be done through a call to this and ultimately ALL sound operations will be done through a call to this. There is no finer-grain than a fully-exposed API. Great for development, great for totally specialist applications and embedded systems, great for raw speed, great for causing programmer brains to explode.

Alsa is an end-user nightmare

Posted Dec 19, 2006 13:32 UTC (Tue) by nim-nim (subscriber, #34454) [Link] (1 responses)

Grr, I wanted to reply here and replied there (http://lwn.net/Articles/214858/) instead.

Alsa is an end-user nightmare

Posted Dec 19, 2006 17:57 UTC (Tue) by jd (guest, #26381) [Link]

Unlike some of the managers I have the unfortunate luck to be saddled with, I have this nasty tendency to listen to people who point out flaws in my ideas. :)

I appreciate your comments and, yes, you are right that users will want full access to the cards. (And sometimes that will be plural. Less so these days, but it wasn't that long ago when if you didn't have both an LAPD1 and a Soundblaster, you didn't have a sound system worth a damn - and programs were written to expect you to have both.)

The tendency has been to push the intelligence into the libraries and drivers, so that apps writers describe what they want and the environment they run over does all of the logic to decide how best to achieve the results. This makes apps writing relatively easy, but does have the penalty that you're limited to what can be achieved (in some form or other) across the board. Multiple APIs you can blend as needed is an improvement, in that applications can choose to take back control on an as-needed basis, but it doesn't solve the problem that applications are still going to be limited to lowest common denominators for a lot of operations.

The more access you want, the higher up you need the intelligence. This is true for any hardware - the less you can assume, the more you have to decide. There are still levels of encapsulation you can do - SCSI vs. IDE, Ethernet vs. ATM, or whatever. This allows you to stack stuff together, so that you only need to bubble up what absolutely can't be done at a lower level, and it only needs to bubble up so far and no further.

The other thing to consider is that sound cards are generally presented as cards. One physical entity with a fixed number of physical channels associated with it. Almost no other device on the system is exposed in this manner. They are all exposed per virtual channel, where one virtual channel may encompass any designated fraction F of N physical channels across M physical devices. Applications just see the virtual channel, they neither need to know or care about the values of F, N and M.

Because sound is treated at a purely physical level, there is much less experience in virtualizing sound than for other systems. Maybe this is something that needs to be addressed.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 17, 2006 12:38 UTC (Sun) by drag (guest, #31333) [Link] (5 responses)

Most of the things that people dislike about ALSA were solved a while ago.

For instance, nowadays all sound cards support mixing by default, either through hardware or via dmix.

Sound cards are all configured automatically by startup scripts.

Most people run into problems because they have things that like to use artsd, or esound, or OSS.

Those things obviously don't benefit from anything in ALSA, so obviously ALSA is not going to make them easy to deal with.

Not only are they difficult to deal with, but they interfere with the proper operation of your sound system.

You can have as many instances of mplayer, totem, or anything else running as you want... but as soon as you start something that has been configured to use OSS, it's all going to turn to shit.

So a distro needs to configure libao to use ALSA. They need to replace OSS-only applications with ones that can use ALSA. They need to replace esound with PulseAudio. They need to configure artsd to use ALSA by default. Etc., etc.
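The libao part, at least, is a one-liner; a sketch of /etc/libao.conf (or a per-user ~/.libao):

    default_driver=alsa

(Depending on the libao version you have, the ALSA driver may be named alsa09 instead.)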

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 18:24 UTC (Mon) by bronson (subscriber, #4806) [Link] (4 responses)

I don't know... The single largest problem that has plagued Alsa since day one -- usability -- is still a crippling problem.

Want about 20 minutes of entertainment? Invite your whiz-kid Mac-using little brother over to your computer and ask him to enable S/PDIF output. He'll hunt through multiple GUI mixers looking in vain for something that says 'spdif'. You can laugh at his confusion when he realizes that the GNOME Sound control panel doesn't actually allow him to modify his sound settings.

Now, this kid is _smart_. You can watch in amazement as he figures out how to invoke lspci and finds the PCI IDs of the card. He looks up the manufacturer and does some Googling. He finds instructions that require calling amixer. Not a problem, he knows how to use the command line. Too bad the instructions are from 4 years ago and silently fail to work anymore. It's about here, after half an hour of thrashing, that any sane person will give up.

You can do this for any non-trivial sound setting. Enable 5.1 output? AC-3? 3D stereo? Endless hours of entertainment.

Alsa is flat-out unusable. It's an embarrassment. On Mac and Windows, these settings are single checkboxes inside an easy-to-find control panel. On Linux, unless you're looking to spend an hour learning the ins and outs of alsactl and editing obscure text files, these simple tasks are simply impossible.

So, yes, Alsa works as long as all you want to do is set output and mic levels. For anything else it still has a long, long way to go.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 3:14 UTC (Tue) by zlynx (guest, #2285) [Link] (1 responses)

You're blaming the wrong thing here. What you want should be the job of Gnome or KDE. Would it make anyone feel better to rename the ALSA mixer to rawcardctl or something? Probably not.

Besides, the name works. Look at a real mixer board sometime. Sure, the volume sliders make sense, and the toggles for input/output mute make sense. But what does that knob with the bit of sticky tape next to it marked "SpecDir A2" do, and what does that bit of tape mean? It's just as meaningless as the ALSA mixer: without reading "Special Directions section A2" you'd have no more idea of what that real-life mixer control did than of what the ALSA control "IEC958 Playback AC97-SPSA" does.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 14:46 UTC (Tue) by otaylor (subscriber, #4190) [Link]

This post encapsulates exactly what is wrong with Linux audio: it has largely been developed by and for people whose model for how sound should work is a mixing board with bits of tape stuck on it. :-)

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 7:02 UTC (Tue) by drag (guest, #31333) [Link] (1 responses)

Well, aside from the fact that you have alsamixergui and gnome-alsamixer, which are fairly intelligent front-ends for doing mixing, I understand where you're coming from.

The problem is, this is how it goes...

When you use amixer or alsamixer, what you see are the actual interfaces provided by the hardware itself. The names are pulled from the hardware itself, I believe.

If your 'ctl.!default' is hw:0, it really should show the hardware mixing devices. I think this behavior is entirely correct: if a user wants to control the hardware directly, then they should be able to.

But ALSA does not necessarily have to expose the hardware directly to userspace. The way it's currently designed, you can set up an asound.conf or ~/.asoundrc and then define your own interfaces through the configuration of various plugins.

For instance, the most common one is dmix, a software mixing plugin that allows many applications to play sound at once. So instead of going directly to the hardware, the sound application goes to dmix, which sets up an audio format, does the mixing, and then sends the result to the hardware.
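Spelled out as an asound.conf sketch (the "dmixed" name, the ipc_key, and the rate are arbitrary example values):

    pcm.!default {
        type plug
        slave.pcm "dmixed"
    }

    pcm.dmixed {
        type dmix
        ipc_key 1024
        slave {
            pcm "hw:0"
            rate 48000
        }
    }

Every application that opens the default PCM then goes through the shared dmix instance instead of grabbing the hardware exclusively.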

And it not only works for sound output; you can do the same for mixer controls. You can intercept mixer controls and present an abstracted view to ALSA applications. For example, when I am using the 'pulse' plugin, all the hardware mixer interfaces are removed and replaced with a single volume control when I open alsamixer.

Now, the reason it's so difficult to get S/PDIF working is that each and every sound card has a different way of doing it. AC'97 sound cards, Intel HD sound cards, emu10k1 sound cards, ice1724 sound cards, etc., all have different hardware interfaces.

So we need to figure out a way to use ALSA and its plugins to set up an idealized mixer interface for user space.

One possible way is to have data files provided with every driver. These files would provide details, in an XML format or something, that could be used to automatically generate an asound.conf file covering several main tasks.

For instance, you could possibly set it up so that for desktop integration you require '5 tasks' that must be profiled for each sound card.

Using these tasks, an idealized mixer interface is then exposed via the ALSA API:
A. Master volume control.
B. Mixer volume control.
C. Sound input selection. (line-in, mic, mic-boost, or spdif-in)
D. Sound output selection. (stereo out, surround out, or spdif out)
E. Special features. (such as 3d effect)

I don't know, but I am guessing that would be a sufficient mixer interface for 90% of users. If the sound card or driver does not support a feature, then it shouldn't be present.

Then, with the '5 tasks' chosen, users with the relevant hardware can create a configuration for each task and send them to ALSA, where they might be included in an alsa-data tarball or something. This way, if developers make a change to a sound driver, they can edit this file without disrupting userland. People having problems can then work with distros to provide a superior task profile to replace the one they have trouble with.

Then the 10% of users for whom this is not good enough can set their defaults to 'hw:0' and do it the way we do now.
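To make the idea concrete, a purely hypothetical data file for one card might look something like this (no such format exists today; every element and attribute name here is invented just to illustrate the proposal):

    <soundcard driver="emu10k1">
      <task name="master-volume" control="Master"/>
      <task name="pcm-volume" control="PCM"/>
      <task name="input-select" options="Line,Mic,SPDIF In"/>
      <task name="output-select" options="Stereo,Surround,SPDIF Out"/>
      <task name="special" options="3D Effect"/>
    </soundcard>

A generator tool could then translate that into the per-card asound.conf plumbing, so the five tasks look identical to userspace no matter which driver is underneath.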

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 22, 2006 16:29 UTC (Fri) by JohnNilsson (guest, #41242) [Link]

Another step could be to formalize the hardware description, something along the lines of how XKB defines keyboard geometry.

This way GUIs could at least provide something more sensible than alsamixer when you need to mess with the details of the hardware.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 17, 2006 16:00 UTC (Sun) by petegn (guest, #847) [Link] (3 responses)

Never mind sound, which has always worked better in Linux than windBloZe for me. What about getting darn WiFi right? Now there's a big bucket of worms that is still next to flippin' useless: difficult to connect, seems to have a mind of its own when it comes to interface naming, WEP keys are a total pain (KWallet SUCKS), and switching from one wireless network to another is almost impossible without a reboot.

Yet in windBloZe, just change the connection and bingo: one out, one in, no friggin' in the rigging.

I ain't afraid of the CLI; in fact a good 60% of my work is in CLI mode in an xterm, far better than any GUI. But WiFi? Do me a favour and get it sorted, folks, it SUCKS BIG TIME...

Joe the sales rep ain't gunna want to start digging around in his system to get a net connection just 'cos he moved next door. It should be simple and intuitive, and that it most certainly ain't.

Pete .

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 0:32 UTC (Mon) by drag (guest, #31333) [Link] (2 responses)

try network-manager

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 8:23 UTC (Mon) by petegn (guest, #847) [Link] (1 responses)

Total crap; tried it. About as much use as a chocolate teapot in a transport cafe.

No, the problem is Linux WiFi; it needs very major improvements, and fast.

I don't play at Linux during the day, I WORK using Linux, therefore clowning around is a no-no. It needs fixing.

Cheers

Pete .

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 15:30 UTC (Mon) by drag (guest, #31333) [Link]

Well, I am sure that your explanation of what is wrong will prove very useful for everybody involved.

Personally, I WORK with Linux and my WiFi WORKS quite well.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 11:33 UTC (Mon) by tialaramex (subscriber, #21167) [Link] (6 responses)

End-user audio software is nearly always by its nature real time software. This means it is actually hard, and all the projects that try to hide that from their own developers will have insurmountable problems, often blaming the kernel developers, the hardware or even the users for their mess.

Our objective has to be to make this easy for the /users/ rather than hoping that we can make it easy for programmers, which it just isn't.

There is some relatively low hanging fruit. So far every "friendly" GUI mixer interface I've seen has been substantially worse than ALSA's sample alsamixer. There is no point in a half-solution here. If the controls don't reflect reality or some of them are missing or don't work because of missing features in your software then it's /worse/ than not having the software at all. If your desktop project, media system or whatever has a mixer that doesn't implement all of ALSA's control API then it's useless, get someone to fix it as their top priority or deprecate it and stop distributing it.

Next to listening to stereo music, the next most common thing users are trying to set up is VoIP. We can and should provide desktop software (not OS install stuff for administrators, real end user software in the menu next to their music player) which helps the user to get their hardware and mixer settings right for both these applications. That means first of all testing their headphones / speakers, and then doing a software loopback test. The software needs to be accompanied by comprehensive trouble-shooting advice, not only about software issues like mixer controls, but also about plugging things in correctly, positioning etc.

I hope McQuillan understands that it is unlikely that a "single robust API" will be a success. Other platforms have had a lot of trouble achieving that goal, and we are probably no different. You will not get the average C# hacker to write software that meets the JACK realtime requirements. On the other hand the JACK users won't put up with the high latency and terrible jitter of a "write big buffers and then block" API written for those C# hackers.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 18, 2006 15:55 UTC (Mon) by drag (guest, #31333) [Link] (5 responses)

The trouble is that each audio card is going to be different.

So the mixer controls and settings for one card in alsamixer aren't going to work for other sound cards.

For example take these three cards that have good support in Linux:
emu10k1 driver for my first gen Creative Audigy.
ice1712 driver for my M-Audio Audiophile 2496.
intel8x0 driver for my old nforce2 board.

All three of these use vastly different ways to set recording over mic or line-in.

With the intel8x0 it's fairly standardized, since it's a generic driver: you hit 'space' on the item you want to record from, then fiddle with the mixers until it sounds good.

Now that won't work for either the ice1712 or the emu10k1 driver. With the emu10k1 there is a selection you have to go into to choose the device to record from, and then the mixers are different from the intel8x0's.

With the ice1712 I am not even able to use alsamixer. It has a totally different set of I/O than a normal sound card (it's more of a 'pro-sumer' class card), and it has its own particular mixer application you have to use to change its settings.

Then there are more things to deal with.

For instance, setting S/PDIF out on these cards is very different for each one, and setting up surround sound is going to differ as well.

So although ALSA does a good job of exposing functionality to users in a way that OSS never could (without a crapload of hacks), and the plugin architecture is very nice, it is still difficult for end users, because they need good knowledge of their particular sound card, since everything is very different for each one.

What is needed is a way to expose functionality to users in an abstracted, standardized way. Then you can write documentation and GUI applications in a way that makes sense to naive users.
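One way to picture that abstraction layer is a dispatch table: each driver's quirks (the three drivers named above are real ALSA drivers, but the per-driver handlers below are invented for illustration) hide behind a single standardized operation such as "select capture source", so a GUI and its documentation can be written once.

```python
# Hypothetical sketch: per-driver quirks behind one uniform operation.
# Each backend returns the mixer-control settings its card needs.

def select_capture_emu10k1(source):
    # emu10k1-style: a single enumerated "Capture Source" control.
    return {"Capture Source": source}

def select_capture_intel8x0(source):
    # intel8x0-style: a per-item capture switch you toggle.
    return {source + " Capture Switch": True}

CAPTURE_BACKENDS = {
    "emu10k1": select_capture_emu10k1,
    "intel8x0": select_capture_intel8x0,
}

def select_capture(driver, source):
    """One call for every card; the quirks live in the backend table."""
    return CAPTURE_BACKENDS[driver](source)
```

The GUI then only ever calls `select_capture(driver, "Mic")`; adding a new card means adding a backend, not rewriting the interface or the manual.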

Now this has to be oriented toward _desktop_ users.

Audio workstations are going to be a different set of problems.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 9:27 UTC (Tue) by nim-nim (subscriber, #34454) [Link] (4 responses)

IMHO you're still thinking as an app writer.

App writers don't want to mess with card differences, so they're asking to simulate a stupid card in all cases (the KDE thing is a step in this direction) and pretend we still use the OSS SoundBlasters of last century.

Users, OTOH, do want to take advantage of the card they shelled out precious money for. If it has twenty inputs/outputs they *want* access to all twenty (and apps just have to live with it).

Now, access does not mean a long list of raw controls labelled in English sound-engineer tech-speak (i18n audience, remember) with no explanation of how they relate to each other. (ALSA insisted on calling the main volume knob on one of my past cards "DAC volume"; it may be technically correct, but what bloody use is it to a user?)

Access means users can easily test each I/O plug and label it with the name of the device plugged in there, and have this device name appear in apps (to take a bad analogy: you don't expect users to select network printers by IP address in office apps, but that's exactly how the ALSA UI works).

Access means giving users a pretty graphical representation of their chip's pipeline, so they know that the foo control works on the bar output and are not required to test controls one after the other until they stumble on the one that works.

Access means renaming controls depending on the input/output chosen, instead of requiring users to remember that in a particular mode a particular slider must be used.

Access means selecting a default volume baseline for each input/output, to take into account that not all devices have the same loudness level.

Access means automatically choosing the best chip mode depending on the output device and input stream.
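The device-labeling idea can be sketched as a tiny data model (entirely hypothetical; no such ALSA facility is implied): the user tests a plug once, names the device attached to it, and applications thereafter display the friendly name instead of the raw control name.

```python
# Illustrative registry mapping raw mixer-control names to user labels.

class OutputRegistry:
    def __init__(self):
        self._labels = {}            # raw control name -> user label

    def label(self, raw_name, user_label):
        """Record the user's name for the device on this plug."""
        self._labels[raw_name] = user_label

    def display_name(self, raw_name):
        # Fall back to the raw name until the user has labeled the plug.
        return self._labels.get(raw_name, raw_name)

reg = OutputRegistry()
reg.label("DAC volume", "Living-room speakers")
```

An app then asks the registry for `display_name("DAC volume")` and shows "Living-room speakers"; the cryptic engineering label never reaches the user.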

Most of this, BTW, does not need to happen in apps; video/sound apps only need to expose the user-labeled I/O list, not reconfigure the whole card every time. (That ekiga needs to perform some of these steps itself, for example, is an indictment of the state of the sound system infrastructure.)

And it's not rocket science: any set-top DVD player with more than one output has been doing it for years (and they used to say VCRs were a UI nightmare).

Also, I don't get the "every chip is a particular case" argument. Other hardware subsystems somehow manage their particular range of diversity far better. Instead of (at best) cloning every particular proprietary Windows control app without thinking about commonalities, the ALSA writers would be well advised to take the best parts of each of them and create a generic, user-friendly control app. That would make basic users happy. Make users happy and app writers will follow suit.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 14:57 UTC (Tue) by pizza (subscriber, #46) [Link] (3 responses)

>Also, I don't get the "every chip is a particular case" argument. Other hardware subsystems somehow manage their particular range of diversity far better. Instead of (at best) cloning every particular proprietary Windows control app without thinking about commonalities, the ALSA writers would be well advised to take the best parts of each of them and create a generic, user-friendly control app. That would make basic users happy. Make users happy and app writers will follow suit.

Do you know how many different (and subtly-incompatible) AC97 codecs there are? There is no "standard AC97 device" but rather a bazillion different codecs that differ enough as to make the "standard" almost meaningless. (And it's said codecs that provide the actual mixer interface!)

Every model of audio board (including different motherboards with the same audio chipset) uses a different proprietary Windows driver, mapping the hardware-specific stuff to the generic Windows mixer, and occasionally providing a proprietary mixer to drive chip-specific stuff in a completely non-standard manner.

Because of this "standard mapping via a proprietary driver", Windows just works for basic audio in/out. But ALSA doesn't have the benefit of every motherboard maker and every board maker creating that shim layer; ALSA has to "just work" and automagically detect everything, soft-ports and all. It does a remarkably good job given the nightmarish task expected of it.

>Access means users can easily test each I/O plug and label it with the name of the device plugged there, and have this device name appear in apps

This is an excellent idea, and I believe it's been possible for some time now. All we need is some enterprising individual to devote considerable hair-loss to writing an app to do all of this, and maintaining a database of every card and motherboard under the sun.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 15:43 UTC (Tue) by nim-nim (subscriber, #34454) [Link] (2 responses)

To give some perspective: I believe the variability of printing devices (from el-cheapo inkjets to department colour printers with special paper types, input trays, stapling, resolutions, job formats, network protocols...) is at least as great as that of sound chipsets.

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 18:42 UTC (Tue) by pizza (subscriber, #46) [Link] (1 responses)

It's nowhere near comparable, because as far as the app is concerned, it just spits out a batch of generic PostScript that gets automagically translated/redirected/etc. to the printer by the backend. Printing also doesn't happen in realtime and doesn't allow multiple users to access the device simultaneously (hence the term "print job").

All an app really needs in order to print is the paper type (to get the size, margins, etc.). Everything else is just an opaque knob that's passed down to the backend unchanged, embedded in the PostScript file.

Additionally, the printer configuration dialog boxes are not provided by the app, or even by the system printing backend, but by the desktop environment/libraries, which are aware of several backends yet hide all of this from the end user.

So if you're going to compare this to ALSA, you need to harass GTK/GNOME or Qt/KDE to provide a single thunk/configuration layer for sound. And I believe that's exactly the point TFA makes: right now there are several capable layers, but they don't coexist.

Meanwhile, if you really want to compare ALSA with printing, you'll need dialog boxes that let you automatically reroute two automatic document scanners to four different printers, while translating your 3-D imaging system's output for a CNC milling machine (and a hologram printer, oh, and also the other four printers). Naturally every piece of equipment is a different brand/model, but we must be completely aware of the capabilities, resolution, speed, cost, and various knobs of each, and be able to visualize every possible combination. Oh, and do it all in hard realtime, changing/scaling/morphing the output based on sliders you're moving on your desktop, while combining this with the output of five different applications that want to use two of your four printers, one of your scanners, and that CNC machine, all at the same time.

Or do you "just want to print?" *chuckles*

Linux printing used to be "a mess", but it wasn't the low-level stuff that sucked -- it was the higher-level (desktop-level) interfacing with that low-level stuff that was terrible. It took a lot of hard work to build the high-level stuff, including modifying the low-level stuff to expose its knobs more easily in a generic manner.

Sound is in much the same state now, except that Linux's low-level stuff (ALSA) is perhaps the most capable API out there, doing nearly everything for nearly everyone -- but because every sound card is different, we have no idea what "sane defaults" are for many situations/tasks, and that's the crux of this mess. Mapping the raw hardware controls to a "desktop task/profile" is the job of the desktop environment.
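The "desktop task/profile" mapping might look something like this sketch (the profile names, control names, and values are all invented for illustration): the desktop environment ships profiles of sane defaults, and applying one translates into concrete mixer settings via a card-specific setter.

```python
# Hypothetical task profiles: sane defaults per desktop use case.
PROFILES = {
    "music": {"Master": 80, "PCM": 90, "Mic Boost": False},
    "voip":  {"Master": 70, "PCM": 70, "Mic Boost": True},
}

def apply_profile(name, set_control):
    """Push every control in a profile through a card-specific setter."""
    for control, value in PROFILES[name].items():
        set_control(control, value)

# For demonstration, "set" the controls into a plain dict instead of
# touching real hardware.
applied = {}
apply_profile("voip", lambda ctl, val: applied.__setitem__(ctl, val))
```

The card-specific setter is where all the per-chip knowledge lives; the profiles themselves stay identical across hardware, which is exactly the split between desktop environment and driver layer the comment argues for.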

Linux desktop architects map out plans for 2007 (Linux.com)

Posted Dec 19, 2006 19:22 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

Don't take it badly, but if ISVs complain about ALSA, that's because they're not getting frantic requests from their users to support it; and if users don't care, that's largely because ALSA never made their lives better or easier.

The ALSA people have this strange notion that ALSA will sell itself to app writers and users, despite offering little or no benefit so far, and lots of pain (API complexity, config complexity). Or, alternatively, that someone else will make everyone grok ALSA's goodness.

The problem is that neither GTK/GNOME, nor Qt/KDE, nor ISVs have any reason to promote ALSA on its behalf. (I won't even mention other sound projects graciously giving up so that only the ALSA API is left to choose.) When ESP pushed CUPS to replace lpr/LPRng, it provided a great end-user front-end to prove the goodness of its solution. Fluendo is writing Elisa to sell GStreamer. What did the ALSA project ever do to demo its baby?

Like everyone else, I like the bright sound future ALSA paints. And then I get home late from work, have to fight the damn ALSA controls to do the simplest things, and suddenly ALSA is not my darling anymore.

(I won't write about the print analogy any more, because you obviously have little knowledge of that area, and I'm not here to flame.)


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds