May 26, 2004
This article was contributed by Dave Fancella
The full text of this interview (much longer) is available here.
As a longtime musician, or so I like to call myself, and a free software enthusiast, I have personally found Audacity to be an indispensable tool for mastering mixes. Other people find a variety of uses for it, including deployment in public radio stations, restoring LPs for CD burning, and more. Audacity has been in continuous development since 1999. It is a multi-track recorder, mixer, waveform visualization tool, and editor all rolled into one. It's the Free Software equivalent of Pro Tools, Sound Forge, and Cakewalk, albeit without the MIDI portion of any of those programs.
Recently I exchanged some email with Dominic Mazzoni, the Lead Developer and founder of Audacity. As a long-time lurker on the Audacity-devel mailing list, I've come to be familiar with Dominic as one of those kind, gentle spirits who leads first with his coding and second with his ideas, and is an inspiration to us all. Here, then, is the email interview with Dominic Mazzoni:
Q:
In the Audacity FAQ, it says "Audacity was started in the fall of 1999 by Dominic Mazzoni while he was a graduate student at Carnegie Mellon University in Pittsburgh, PA, USA. He was working on a research project with his advisor, Professor Roger Dannenberg, and they needed a tool that would let them visualize audio analysis algorithms. Over time, this program developed into a general audio editor, and other people started helping out." Would you provide some information on the nature of the tool? How did it turn into a general audio editor? Was it a graded assignment, and if so, what grade did you get?
A:
I was in a Ph.D. program at CMU, and the way it works there is that grad students are supposed to work on independent research right from day one, even while we're taking classes. My dream was to develop automatic music transcription software that could take any recording and turn it into sheet music. This was too difficult, of course, so I was working on monophonic pitch transcription and melody matching, which eventually led to some reasonably successful research in how to retrieve a melody from a database of songs based on a sung/hummed query. While I was trying to visualize pitch transcription algorithms, I started developing my own tool. Since there weren't any other audio editors for Linux that I liked, and I couldn't afford any good editors for the Mac (my two preferred platforms), I thought it would be fun to turn my project into a complete editor.
My advisor, Roger [DF: Roger Dannenberg is the mastermind behind the Nyquist scripting library which is now embedded in Audacity and is just one way to extend Audacity's sound processing capabilities], was very supportive of the project, and convinced me to turn the editor into a Computer Science research project. So I came up with an interesting data structure that could do editing operations quickly, and we wrote a paper on it. By the end of that year, though, I was having a lot of fun with the audio editor and was spending more and more time on it outside of my official research. I came up with the name "Audacity" and released it on Sourceforge. It was pretty limited at the time, but it was cross-platform, which was a big deal, and it worked well enough to generate interest. From that point on I worked on it mostly as a hobby, rather than as a part of my research, though I did find it useful for my research, too.
Q:
Audacity has been gaining a lot of traction in the market lately. How do you feel about that? Do you ever get the "15 minutes of fame" feeling, or is it something you ever really think about?
A:
I've thoroughly enjoyed all of the attention that Audacity has gotten. I enjoy working on something that people find useful, and I would choose fame over fortune any day. I've invested so much time into Audacity that it can affect me pretty seriously - seeing a good review or getting an email full of praise can give me an emotional high that lasts all week, but unfortunately bug reports, especially serious ones where people have lost work because of a bug in Audacity, can really make me feel depressed. Recently I had to take a step back and give myself a vacation from responding to emails to audacity-help for my own sanity (thankfully, other developers and users have done a great job of answering the mail).
Q:
How do you feel, and how do you respond when users show up that want specific features found only in specific commercial applications?
A:
Actually I don't think that anyone has ever said they wouldn't use Audacity if it didn't work exactly like their favorite proprietary application. Most people are perfectly happy to do things a different way as long as it's equally intuitive and powerful. Sometimes we're able to satisfy users by making Audacity as customizable as possible - for example you can edit all of Audacity's keyboard shortcuts and make them the same as some other program if you want. The other Audacity developers and I came up with our own keyboard shortcuts based on what we thought would be the most intuitive and useful, but users are free to modify that (and they can even save their keyboard layouts as XML and share them with other users).
I've been a Mac user since the very beginning (my parents bought an original Macintosh in 1984) so I've always been a fan of intuitive, "discoverable" interfaces. My main complaint with other audio editors is that too often they are trying to emulate the interfaces of analog mixing boards, which I didn't think was very intuitive for the rest of us. I wanted to create an interface that anyone computer-literate could figure out how to use on their own.
Q:
For that matter, even the digital mixing boards are trying to emulate the analog interfaces when they don't really have to. :) Are there any specific areas where you think Audacity could really take advantage of the fact that it's software for a general use computer to make some really nice interface?
A:
There are lots of areas where an audio editor could be "smarter" than it is now to save users time. I'd like to see Audacity do automatic beat detection and have an option to snap the selection to the nearest beat boundary, making it easier to cut an entire chorus out of a song without breaking the tempo, for example. I'm sure there are hundreds of other things like that.
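[Editor's illustration, not part of the interview: the "snap to the nearest beat boundary" idea can be sketched in a few lines once beat times are known. This is a minimal sketch that assumes a sorted list of beat times is already available; it does not attempt beat detection itself. The names snap_to_beat and snap_selection are hypothetical and are not Audacity functions.]

```python
from bisect import bisect_left


def snap_to_beat(t, beats):
    """Return the beat time closest to t.

    `beats` is assumed to be a non-empty list of beat times in seconds,
    sorted in ascending order (produced by some beat detector).
    """
    i = bisect_left(beats, t)
    if i == 0:
        return beats[0]
    if i == len(beats):
        return beats[-1]
    before, after = beats[i - 1], beats[i]
    # Ties go to the earlier beat.
    return before if t - before <= after - t else after


def snap_selection(start, end, beats):
    """Snap both edges of a selection to their nearest beats."""
    return snap_to_beat(start, beats), snap_to_beat(end, beats)


if __name__ == "__main__":
    beats = [0.0, 0.5, 1.0, 1.5, 2.0]  # 120 BPM, assumed detector output
    print(snap_selection(0.46, 1.62, beats))
```

With both edges on beat boundaries, cutting the selection removes a whole number of beats, which is what keeps the tempo intact.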
If you look closely, you'll see lots of subtle differences in the way that Audacity operates. Unlike almost every other audio program I've seen, Audacity lets you have multiple tracks, each with a different sample format (16-bit/32-bit) and sample rate (44100 Hz, etc) - and Audacity automatically mixes them on the fly. It also has a rather unique built-in amplitude envelope editor, and one of the best frequency analysis views.
Q:
How would you define Audacity's target market?
A:
Well, it's free, so everyone. Seriously. I'd like Audacity to be good enough to meet the needs of 90% of the users who just want to record a song or an interview, create a mix, convert a tape or LP to CD, etc. Then for everyone who has more advanced needs than that, there are plenty of other tools available - but there's no reason not to keep Audacity around also for the few things that Audacity might do best.
Audacity is a particularly good choice when it's helpful to have a truly cross-platform tool, such as in a mixed-operating-system school computer lab - or when the licensing cost of other tools is prohibitive, such as in third-world countries or at public radio stations.
Q:
I understand that Audacity uses a block file approach, where instead of manipulating each track as one large file you guys have broken each track down into many small files. Would you tell us more about this setup? Why did you choose it over other methods? What are the benefits and drawbacks with using block files?
A:
Well, to be honest, when I started Audacity I didn't know about Edit Decision Lists. My only experience was with tools like SoundEdit and (early versions of) CoolEdit, both of which were very slow at doing things like Cut, Copy, Paste, and Undo, because they rewrote the entire audio file on disk after each operation.
Q:
How about some more information on Edit Decision Lists?
A:
An edit decision list is a list of all of the modifications you made to the original audio. The original audio file is left alone, and when you press play, the computer applies all of the edits in real-time to render the audio. This makes editing very fast, since the program is just manipulating a list of edits, but it can increase the amount of processing power required to play back audio in real-time. These days, though, you can do hundreds of edits before you even begin to slow down a modern PC.
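[Editor's illustration, not part of the interview: a minimal sketch of the edit decision list idea described above. It assumes the simplest possible representation, a list of (start, end) ranges into an untouched source buffer, and supports only cuts. The class and method names are hypothetical and do not come from any particular editor.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One list entry: play source[start:end]."""
    start: int
    end: int


class EditDecisionList:
    def __init__(self, source):
        self.source = source  # original audio, never modified
        self.segments = [Segment(0, len(source))] if len(source) else []

    def cut(self, start, end):
        """Remove [start, end) of the edited timeline.

        Only the list of segments is rewritten; no samples are copied.
        """
        kept = []
        pos = 0  # position of the current segment in the edited timeline
        for seg in self.segments:
            seg_start, seg_end = pos, pos + (seg.end - seg.start)
            if seg_start < start:  # part of this segment lies before the cut
                keep_end = min(seg_end, start)
                kept.append(Segment(seg.start, seg.start + (keep_end - seg_start)))
            if seg_end > end:  # part of this segment lies after the cut
                keep_start = max(seg_start, end)
                kept.append(Segment(seg.start + (keep_start - seg_start), seg.end))
            pos = seg_end
        self.segments = kept

    def render(self):
        """Apply the edits at playback time by walking the list."""
        out = []
        for seg in self.segments:
            out.extend(self.source[seg.start:seg.end])
        return out


if __name__ == "__main__":
    edl = EditDecisionList(list(range(10)))
    edl.cut(2, 5)
    edl.cut(1, 3)
    print(edl.segments)
    print(edl.render())
```

Each cut here is cheap, but render has to visit every segment, which is the playback cost mentioned in the answer: the more edits in a region, the more small pieces the player has to stitch together.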
I knew I could do better using my Computer Science knowledge, and soon I had worked out a method that involves splitting each track into small pieces - say about 2 MB each. If you allow each piece to be any size from 1 MB to 2 MB, but no smaller or larger, then it turns out you can implement all of the basic editing operations (cut, paste, etc.) without ever having to modify more than 5 pieces ("blocks") at a time. This was what I ended up writing a paper on.
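[Editor's illustration, not part of the interview: a minimal sketch of the bounded-block idea, not Audacity's actual code. It assumes blocks are immutable in-memory tuples rather than files on disk, uses 8 samples to stand in for "about 2 MB", and shows only deletion. The names split_into_blocks and delete_range are hypothetical. The point is that a cut rebuilds only the blocks at the edges of the cut, plus at most one neighbour, while every other block is reused as is.]

```python
MAX_BLOCK = 8               # stands in for "about 2 MB"
MIN_BLOCK = MAX_BLOCK // 2  # stands in for "1 MB"


def split_into_blocks(samples):
    """Split samples into evenly sized blocks within [MIN_BLOCK, MAX_BLOCK].

    A run shorter than MIN_BLOCK becomes a single short block.
    """
    n = len(samples)
    if n == 0:
        return []
    count = -(-n // MAX_BLOCK)  # ceiling division
    base, extra = divmod(n, count)
    blocks, pos = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)
        blocks.append(tuple(samples[pos:pos + size]))
        pos += size
    return blocks


def delete_range(blocks, start, end):
    """Return a new block list with samples [start, end) removed.

    Assumes 0 <= start < end <= total length. Blocks entirely outside
    the range are reused by reference, not copied.
    """
    prefix, suffix, remnant = [], [], []
    offset = 0
    for block in blocks:
        block_end = offset + len(block)
        if block_end <= start:
            prefix.append(block)  # untouched, before the cut
        elif offset >= end:
            suffix.append(block)  # untouched, after the cut
        else:
            # Keep whatever part of this block lies outside the cut.
            remnant.extend(block[:max(start - offset, 0)])
            remnant.extend(block[end - offset:])
        offset = block_end
    # A leftover that is too small absorbs one neighbouring block.
    if 0 < len(remnant) < MIN_BLOCK:
        if prefix:
            remnant = list(prefix.pop()) + remnant
        elif suffix:
            remnant = remnant + list(suffix.pop(0))
    return prefix + split_into_blocks(remnant) + suffix


if __name__ == "__main__":
    blocks = split_into_blocks(list(range(40)))
    edited = delete_range(blocks, 10, 21)
    print([len(b) for b in edited])
```

Because the leftover after a cut is shorter than two full blocks, and absorbing a neighbour adds at most one more, re-splitting it never produces more than two new blocks in this sketch. The exact bound quoted in the answer depends on details of the real implementation that are not given here.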
In doing the research for the paper, I learned about Edit Decision Lists and other techniques for nondestructive audio editing. In the end I decided while there were some advantages to EDLs, there were just as many advantages to the blocked-file approach, so it would be better to keep Audacity unique and capitalize on the strengths of this approach, rather than switch to EDLs just to copy everyone else.
One advantage of the blocked-file approach is that you can have multiple "references" to the same data in multiple places. So duplicating a track in Audacity, or creating a loop (using the Repeat effect), are both virtually instantaneous. Also, because Audacity never splits files smaller than about a megabyte, it doesn't slow down trying to play back a region that contains hundreds of edits, which can be a problem with EDL-based editors.
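[Editor's illustration, not part of the interview: a minimal sketch of why shared references make duplicating and repeating cheap. It assumes a track is just a list of references to immutable blocks, with tuples standing in for block files on disk. The Track class and its methods are hypothetical names, not Audacity's API.]

```python
class Track:
    """A track as an ordered list of references to immutable blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # copies the references, not the samples

    def duplicate(self):
        """A second track that points at the same blocks."""
        return Track(self.blocks)

    def repeat(self, times):
        """A loop: the same block references, listed `times` times over."""
        return Track(self.blocks * times)

    def samples(self):
        """Flatten the blocks into one list of samples (for playback)."""
        return [s for block in self.blocks for s in block]


if __name__ == "__main__":
    track = Track([tuple(range(4)), tuple(range(4, 8))])
    loop = track.repeat(3)
    print(len(loop.blocks), len(loop.samples()))
    print(loop.blocks[2] is track.blocks[0])
```

The cost of duplicate and repeat depends on the number of blocks, not on the number of samples. Sharing is safe in this sketch only because blocks are never modified in place; an edit builds new blocks instead.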
Q:
More recently, there has been a bit of buzz over a new back end implementation of Audacity's work code in a library that has been named "Mezzo". Would you tell us a bit about Mezzo?
A:
We've been talking about something like Mezzo for years, but Joshua Haberman (one of the earliest Audacity developers) and I finally started working on it a couple months ago. We did a lot of redesigning and rewriting together early on, but now that we're mostly happy with the new design, Joshua has been doing most of the work.
Mezzo is a rewrite of all of the major core features of Audacity aside from the graphical interface. While Audacity is distributed under the terms of the GNU General Public License, which means that the source code can only be borrowed for use in other GPL or GPL-compatible programs, Mezzo will be released under a very unrestrictive BSD-like license that will allow it to be used by almost anyone. We hope that this will encourage many more people to use Mezzo in projects unrelated to Audacity, including commercial products, which will lead to Mezzo being much more robust and stable.
Well, thank you very much Dominic for your time, both in this interview and your time spent bringing us Audacity. It definitely fills a hole for many of us, and as usual, there isn't really any way to properly thank you other than continuing to use and support Audacity.
Audacity can be found at audacity.sourceforge.net. Information on Mezzo can be found in the Audacity Wiki.