Interview with Audacity developer Dominic Mazzoni
Recently I exchanged some email with Dominic Mazzoni, the Lead Developer and founder of Audacity. As a long-time lurker on the Audacity-devel mailing list, I've come to be familiar with Dominic as one of those kind, gentle spirits who leads first with his coding and second with his ideas, and is an inspiration to us all. Here, then, is the email interview with Dominic Mazzoni:
Q: In the Audacity FAQ, it says "Audacity was started in the fall of 1999 by Dominic Mazzoni while he was a graduate student at Carnegie Mellon University in Pittsburgh, PA, USA. He was working on a research project with his advisor, Professor Roger Dannenberg, and they needed a tool that would let them visualize audio analysis algorithms. Over time, this program developed into a general audio editor, and other people started helping out." Would you provide some information on the nature of the tool? How did it turn into a general audio editor? Was it a graded assignment, and if so, what grade did you get?
My advisor, Roger [DF: Roger Dannenberg is the mastermind behind the Nyquist scripting library which is now embedded in Audacity and is just one way to extend Audacity's sound processing capabilities], was very supportive of the project, and convinced me to turn the editor into a Computer Science research project. So I came up with an interesting data structure that could do editing operations quickly, and we wrote a paper on it. By the end of that year, though, I was having a lot of fun with the audio editor and was spending more and more time on it outside of my official research. I came up with the name "Audacity" and released it on SourceForge. It was pretty limited at the time, but it was cross-platform, which was a big deal, and it worked well enough to generate interest. From that point on I worked on it mostly as a hobby, rather than as a part of my research, though I did find it useful for my research, too.
By the way, I never finished my Ph.D. - I decided I needed a break from grad school after a couple of years so I moved back to California to work for a while. I received my Master's degree from CMU last year after completing an additional course on my own time.
Q: What do you do when you're not working on Audacity? Family? Kids? I seem to remember reading somewhere that you were working out at NASA's Jet Propulsion Laboratory. Do you program there, or actually build rockets?
I'm single, but I try to keep an active social life. When I'm not at JPL or working on Audacity, you might find me playing the piano in a jazz band, cooking vegetarian food, ballroom dancing, playing board games, or riding my unicycle.
Q: Audacity has been gaining a lot of traction in the market lately. How do you feel about that? Do you ever get the "15 minutes of fame" feeling, or is it something you ever really think about?
Q: Do you envision a day when you might turn away from Audacity as Lead Developer and pursue something else? Do you visualize yourself in your retirement, with a wheelchair, oxygen tanks, and a laptop still hacking on Audacity (or something to that effect)?
If somebody else wanted to take over as the project leader, I would probably let them (assuming I think they'd do a good job, of course). Until then, I'll put in as much time as I can and don't plan to retire anytime soon if nobody else would be taking over.
Q: Occasionally, users post thank-you messages to the developer's list, thanking you and the other developers. Frequently they provide links to their own music that they recorded and mastered with Audacity. Do you ever follow the links? If so, what kind of diversity do you find among the users of Audacity?
Q: It can be said that many Free Software projects are just Free as in Speech knock-offs of popular commercial applications. Is Audacity just a knock-off of commercial applications?
Q: So I take it this is a 'no'?
Q: How do you feel, and how do you respond when users show up that want specific features found only in specific commercial applications?
I've been a Mac user since the very beginning (my parents bought an original Macintosh in 1984) so I've always been a fan of intuitive, "discoverable" interfaces. My main complaint with other audio editors is that too often they are trying to emulate the interfaces of analog mixing boards, which I didn't think was very intuitive for the rest of us. I wanted to create an interface that anyone computer-literate could figure out how to use on their own.
Q: For that matter, even the digital mixing boards are trying to emulate the analog interfaces when they don't really have to. :) Are there any specific areas where you think Audacity could really take advantage of the fact that it's software for a general use computer to make some really nice interface?
If you look closely, you'll see lots of subtle differences in the way that Audacity operates. Unlike almost every other audio program I've seen, Audacity lets you have multiple tracks, each with a different sample format (16-bit/32-bit) and sample rate (44100 Hz, etc) - and Audacity automatically mixes them on the fly. It also has a rather unique built-in amplitude envelope editor, and one of the best frequency analysis views.
Q: How would you define Audacity's target market?
Audacity is a particularly good choice when it's helpful to have a truly cross-platform tool, such as in a mixed-operating-system school computer lab - or when the licensing cost of other tools is prohibitive, such as in third-world countries or at public radio stations.
Q: If I recall correctly, Audacity still doesn't cope well with project files that are exchanged between platforms. Is that still true?
Q: What innovations do you think Audacity brings to the table?
I also think that Audacity has lots of subtle innovations hidden in surprising places. For example, when you select a note in Audacity and then open the "Change Pitch" dialog, Audacity analyzes the selected audio and automatically fills in the fundamental frequency in the "from" box, letting you type in something else in the "to" box. Another innovation is the "Import Raw" command that can automatically figure out the format of any (uncompressed) audio file, even in a weird unsupported format, by determining what interpretation of the bits results in the most continuous-looking signal. I also believe that the double-handles on the amplitude envelopes (which let you amplify a signal beyond 100%) and the multi-mode tool are all innovations that are unique to Audacity.
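The "most continuous-looking signal" idea behind "Import Raw" can be illustrated with a small sketch. This is not Audacity's code: the candidate list, the roughness score, and the names `CANDIDATES`, `roughness`, and `guess_format` are all assumptions made for illustration. The sketch decodes the same bytes under several sample formats and keeps the one whose neighbouring samples jump the least.

```python
import math
import struct

# Hypothetical candidates: (label, byte order, struct code, bytes per sample, full scale).
CANDIDATES = [
    ("8-bit signed", "<", "b", 1, 128.0),
    ("8-bit unsigned", "<", "B", 1, 128.0),
    ("16-bit signed little-endian", "<", "h", 2, 32768.0),
    ("16-bit signed big-endian", ">", "h", 2, 32768.0),
]


def roughness(raw, order, code, width, scale):
    """Average jump between neighbouring samples, as a fraction of full scale."""
    count = len(raw) // width
    samples = struct.unpack(f"{order}{count}{code}", raw[:count * width])
    jumps = sum(abs(b - a) for a, b in zip(samples, samples[1:]))
    return jumps / (scale * max(count - 1, 1))


def guess_format(raw):
    """Return the label of the interpretation that looks smoothest."""
    return min(CANDIDATES, key=lambda c: roughness(raw, *c[1:]))[0]


# Usage: a 440 Hz tone stored as 16-bit little-endian samples at 44100 Hz.
tone = [int(16000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(2000)]
raw = struct.pack("<2000h", *tone)
print(guess_format(raw))
```

A wrong interpretation turns the low-order bytes into apparent noise, so its score ends up far higher than the correct one. A real importer would presumably also have to consider channel count, header offsets, and more formats, none of which this sketch attempts.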
Q: Have you had to make many compromises in Audacity's interface just so it can be consistent with specific platform expectations that differ across platforms?
Audacity would definitely be easier if we only had to support one platform. Windows has a nice toolbar widget that we could have used instead of writing our own. Mac OS X has a great preference dialog we could have used. And on Linux there are lots of libraries we could link to that would provide all sorts of useful functionality, but it wouldn't be cross-platform.
Q: I understand that Audacity uses a block file approach, where instead of manipulating each track as one large file you guys have broken each track down into many small files. Would you tell us more about this setup? Why did you choose it over other methods? What are the benefits and drawbacks of using block files?
Q: How about some more information on Edit Decision Lists?
I knew I could do better using my Computer Science knowledge, and soon I had worked out a method that involves splitting each track into small pieces - say about 2 MB each. If you allow each piece to be any size from 1 MB to 2 MB, but no smaller or larger, then it turns out you can implement all of the basic editing operations (cut, paste, etc.) without ever having to modify more than 5 pieces ("blocks") at a time. This was what I ended up writing a paper on.
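To make the size rule concrete, here is a minimal sketch of a delete operation on a track stored as size-limited blocks. It is a simplified illustration, not the data structure from the paper or from Audacity: blocks are in-memory tuples rather than files, `MIN_LEN` and `MAX_LEN` are tiny stand-ins for the 1 MB and 2 MB limits, and the helper names are hypothetical.

```python
MIN_LEN, MAX_LEN = 4, 8  # stand-ins for the 1 MB and 2 MB limits


def chunk(samples):
    """Split samples into near-equal blocks, each at most MAX_LEN long."""
    samples = tuple(samples)
    if not samples:
        return []
    n = -(-len(samples) // MAX_LEN)  # ceiling division
    base, extra = divmod(len(samples), n)
    blocks, pos = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        blocks.append(samples[pos:pos + size])
        pos += size
    return blocks


def delete(blocks, start, end):
    """Remove samples[start:end]; return (new_blocks, blocks_written)."""
    offsets = [0]
    for b in blocks:
        offsets.append(offsets[-1] + len(b))
    # Blocks holding the first and the last deleted sample.
    first = next(i for i in range(len(blocks)) if offsets[i + 1] > start)
    last = next(i for i in range(len(blocks)) if offsets[i + 1] >= end)
    leftover = blocks[first][:start - offsets[first]] + blocks[last][end - offsets[last]:]
    lo, hi = first, last
    if len(leftover) < MIN_LEN:
        # Too small to stand alone: merge with one neighbouring block.
        if lo > 0:
            lo -= 1
            leftover = blocks[lo] + leftover
        elif hi < len(blocks) - 1:
            hi += 1
            leftover = leftover + blocks[hi]
    written = chunk(leftover)
    return blocks[:lo] + written + blocks[hi + 1:], len(written)


# Usage: a 40-sample track, then a cut that leaves only two stray samples.
track = chunk(range(40))
edited, written = delete(track, 9, 15)
print([len(b) for b in edited], written)
```

In this sketch the leftover pieces are always shorter than two full blocks, so a delete rewrites at most two blocks no matter how long the track is. Every other block is carried over untouched.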
In doing the research for the paper, I learned about Edit Decision Lists and other techniques for nondestructive audio editing. In the end I decided while there were some advantages to EDLs, there were just as many advantages to the blocked-file approach, so it would be better to keep Audacity unique and capitalize on the strengths of this approach, rather than switch to EDLs just to copy everyone else.
One advantage of the blocked-file approach is that you can have multiple "references" to the same data in multiple places. So duplicating a track in Audacity, or creating a loop (using the Repeat effect), are both virtually instantaneous. Also, because Audacity never splits files smaller than about a megabyte, it doesn't slow down trying to play back a region that contains hundreds of edits, which can be a problem with EDL-based editors.
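The shared-reference idea can be sketched as follows. This is an illustration under assumptions rather than Audacity's implementation: the `BlockStore` class, its reference counting, and the `duplicate`, `repeat`, and `render` helpers are hypothetical names, and blocks live in memory instead of in files.

```python
class BlockStore:
    """Holds each block of samples once, with a count of the tracks using it."""

    def __init__(self):
        self._data = {}
        self._refs = {}
        self._next_id = 0

    def add(self, samples):
        block_id = self._next_id
        self._next_id += 1
        self._data[block_id] = tuple(samples)
        self._refs[block_id] = 1
        return block_id

    def retain(self, block_id):
        self._refs[block_id] += 1

    def release(self, block_id):
        self._refs[block_id] -= 1
        if self._refs[block_id] == 0:
            # No track uses this block any more, so its data can go.
            del self._data[block_id], self._refs[block_id]

    def read(self, block_id):
        return self._data[block_id]

    def stored_blocks(self):
        return len(self._data)


def duplicate(store, track):
    """Copy a track by copying its list of block ids, not the audio."""
    for block_id in track:
        store.retain(block_id)
    return list(track)


def repeat(store, track, times):
    """Build a looped track that lists the same block ids 'times' times."""
    looped = []
    for _ in range(times):
        looped += duplicate(store, track)
    return looped


def render(store, track):
    """Flatten a track back into a plain list of samples."""
    return [s for block_id in track for s in store.read(block_id)]


# Usage: two stored blocks serve the original, a copy, and a three-times loop.
store = BlockStore()
track = [store.add(range(0, 4)), store.add(range(4, 8))]
copy = duplicate(store, track)
looped = repeat(store, track, 3)
print(store.stored_blocks(), len(render(store, looped)))
```

The cost of `duplicate` and `repeat` here depends only on the number of blocks, not on the amount of audio in them, which is the property the answer above describes.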
Q: So does Audacity use a Copy On Write method, then? Or is it somewhat beyond something as rudimentary as Copy On Write?
There are definitely some problems with the approach, though, that we're trying to work around. The idea that an Audacity project is actually a file plus a data folder is confusing to many people at first.
Q: More recently, there has been a bit of buzz over a new back end implementation of Audacity's work code in a library that has been named "Mezzo". Would you tell us a bit about Mezzo?
Mezzo is a rewrite of all of the major core features of Audacity aside from the graphical interface. While Audacity is distributed under the terms of the GNU General Public License, which means that the source code can only be borrowed for use in other GPL or GPL-compatible programs, Mezzo will be released under a very unrestrictive BSD-like license that will allow it to be used by almost anyone. We hope that this will encourage many more people to use Mezzo in projects unrelated to Audacity, including commercial products, which will lead to Mezzo being much more robust and stable.
Q: Are there any plans to support Mezzo with bindings to other languages, such as Python or Perl? On a related note, if you take all the work code out of Audacity and leave just the GUI, what are the potential ramifications to Audacity itself if you were to look at a GUI-oriented language, such as wxPython, in order to facilitate GUI development? Would the C++ dependency be lessened enough to make it feasible to switch the GUI to a different language, one that arguably would in fact facilitate development?
I think it will be years before we could consider creating the GUI for Audacity in wxPython. But much sooner than that I'm sure that somebody could create a very simple prototype editor using Mezzo and wxPython. That would be a fun exercise. I was thinking of writing a very simple text-only audio editor sometime just for fun.
Right now the biggest advantage of Mezzo over the equivalent part of the Audacity code is that Mezzo is much cleaner and easier to read. We've already started to add some enhancements, though, like more flexible blocked-file formats.
Q: Any ideas how soon we'll get a release that uses Mezzo?
Well, thank you very much Dominic for your time, both in this interview and your time spent bringing us Audacity. It definitely fills a hole for many of us, and as usual, there isn't really any way to properly thank you other than continuing to use and support Audacity.
Audacity can be found at audacity.sourceforge.net. Information on Mezzo can be found in the Audacity Wiki.
Index entries for this article | |
---|---|
GuestArticles | Fancella, Dave |