Refining the Process of Digitizing Vinyl Records

By Forrest Cook
December 23, 2008

In October, your author discussed the process of digitizing vinyl records for the creation of a digital audio library. Since that time, the process has been performed on around 40 disks and a number of refinements have been made. This article discusses what has been learned in that time.

One part of the digitizing process that has proven to work well involves treating one side of the original media as a single chunk of data. Many of the processing steps can be performed on these large data chunks before splitting them up into individual tracks.

[Audacity Overrun]

After making numerous recordings, it was discovered that a single record level, 93 on the inputs of the M-Audio Delta 44, consistently produced recordings with a useful volume range on the majority of the records that were copied. An interesting phenomenon was observed with some recordings that were recorded with too much gain. On loud passages, as the waveform reached the upper or lower limit (rails in electronic-speak), instead of just flattening out, a complete inversion of the wave would occur, resulting in harsh sounding rail-to-rail glitches. The source of the problem is open to speculation. If this should occur, it is best to make a new recording of the album side with a lower input level.
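The rail-to-rail glitches described above can also be hunted for programmatically. What follows is a minimal sketch, not a tool your author used: it assumes the samples have already been loaded as a list of signed integers, and the function name `find_rail_flips`, the 16-bit `full_scale` default and the 2% `margin` are all illustrative choices.

```python
def find_rail_flips(samples, full_scale=32767, margin=0.02):
    """Return indices where a sample jumps from one rail to the opposite rail.

    A normally clipped waveform stays flat at one rail; the glitch described
    in the article shows up as adjacent samples sitting at opposite rails.
    full_scale and margin are assumptions; adjust them for the sample format.
    """
    threshold = full_scale * (1.0 - margin)
    flips = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        high_to_low = prev >= threshold and cur <= -threshold
        low_to_high = prev <= -threshold and cur >= threshold
        if high_to_low or low_to_high:
            flips.append(i)
    return flips
```

A side that yields any hits is probably a candidate for re-recording at a lower input level, as suggested above.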

Having two machines handy has helped to optimize the audio processing work. One machine is dedicated to making the initial album side recordings. The sides are minimized in size by removing the data before the recorded audio starts and after it ends, and fade-ins and fade-outs are added to the whole album side. The album sides are copied to another machine with a faster processor for further processing. The original copy is kept around as a backup until the side has been fully processed. After copying the recorded album side to the secondary machine, a new recording can be started on the recording machine.
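In the workflow above the trimming and fades are done by hand in Audacity. For readers who would rather script the step, here is a rough sketch of the same idea on a plain list of samples. The function names, the amplitude `threshold` and the linear fade shape are assumptions made for illustration; a real recording with stylus-drop noise may still need manual trimming.

```python
def trim_silence(samples, threshold):
    """Drop leading and trailing samples whose magnitude is below threshold."""
    start = 0
    while start < len(samples) and abs(samples[start]) < threshold:
        start += 1
    end = len(samples)
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out of fade_len samples each."""
    out = list(samples)
    n = len(out)
    fade_len = min(fade_len, n // 2)  # never let the two fades overlap
    for i in range(fade_len):
        gain = i / fade_len           # ramps from 0.0 up towards 1.0
        out[i] = out[i] * gain
        out[n - 1 - i] = out[n - 1 - i] * gain
    return out
```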

The process of removing clicks and scratches from an album side has seen the most changes since the original article. This is a bit of a learned art. The first step now involves visually inspecting the waveform of the album side with Audacity. Often a few huge spikes will be visible on the recording. They can be removed by repeatedly selecting an area and zooming in until the zoom resolution shows individual samples as dots. The repair operation should be performed on all of the large clicks. Smaller clicks can often be found and removed by zooming into the quiet passages; an almost infinite amount of hunting, zooming and repairing can be done.
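The idea behind spotting and repairing a large spike can be sketched in a few lines. This is only an illustration under stated assumptions: the window size, the `threshold` and the function names are made up here, and linear interpolation stands in for whatever method Audacity's repair operation actually uses.

```python
import statistics


def find_clicks(samples, threshold, half_window=2):
    """Return indices of samples that stand far away from their local median."""
    clicks = []
    for i in range(half_window, len(samples) - half_window):
        window = samples[i - half_window:i + half_window + 1]
        if abs(samples[i] - statistics.median(window)) > threshold:
            clicks.append(i)
    return clicks


def repair_clicks(samples, clicks):
    """Replace each flagged sample with the average of its two neighbours."""
    out = list(samples)
    for i in clicks:
        out[i] = (out[i - 1] + out[i + 1]) / 2
    return out
```

The median is used instead of a plain neighbour average so that a single spike does not also cause the samples next to it to be flagged. Clicks wider than one or two samples would need a larger window and a better interpolator.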

Another good way to find clicks is to listen, pause, remove and move on. Most tracks can be cleaned up to a reasonable level without too much effort. Some albums can contain an incredible number of clicks while others can be nearly click-free. After the manual deglitching is done, the automated click removal step can be performed. This is now optional, but it can find additional clicks that are buried in busy waveforms.

After whatever amount of declicking seems reasonable, the audio is exported from Audacity as a .wav file. Before exiting Audacity, the Stereonorm script (available here) is run on the .wav file to bring the left and right channel levels up to 100% volume. If the normalization results look reasonable compared to the Audacity visual representation of the recording, Audacity is exited and restarted with the normalized recording. If the normalization numbers do not seem right compared to the visual wave representation, it is often possible to remove more offending large clicks, export again and rerun the normalization step. Although it may make audiophiles cringe, it may be beneficial to use the repair function to shave the level off the peaks of loud percussive waveforms. Done sparingly, this can be used to fix balance problems encountered during the normalization step.
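To show why a single leftover click can throw off the balance, here is a sketch of per-channel peak normalization. It is not the Stereonorm script itself, whose internals are not described here; the function names and the 16-bit `full_scale` default are assumptions.

```python
def normalize_channel(samples, full_scale=32767):
    """Scale one channel so that its largest peak reaches full_scale.

    Returns the scaled samples and the gain that was applied.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples), 1.0
    gain = full_scale / peak
    return [int(round(s * gain)) for s in samples], gain


def normalize_stereo(left, right, full_scale=32767):
    """Normalize each channel independently and report both gains."""
    new_left, gain_left = normalize_channel(left, full_scale)
    new_right, gain_right = normalize_channel(right, full_scale)
    return new_left, new_right, gain_left, gain_right
```

Because the gain depends only on the single largest sample, one big click in the left channel would hold that channel's gain down while the right channel is boosted normally. Comparing the two reported gains is one way to notice that more click removal is needed.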

The version of Audacity that your author has been using, 1.3.4-beta on Ubuntu 8.04, has a few bugs that can cause crashes and the loss of time-consuming work. Occasionally after doing a lot of repairs, attempting to export a file as .wav produces a long stream of zero-length write errors. It is usually possible to recover from this by writing out the data in the Audacity native .aup format, exiting and restarting Audacity with the .aup file, and trying the .wav export again. On numerous occasions, adding a label track followed by doing more click repairs has caused Audacity to crash. It is advisable to perform the labeling step on a new instantiation of Audacity. Hopefully these bugs will disappear when the system gets updated to a newer version of Audacity.

After investing many hours into the creation of a large audio library (now up to around 200GB), it becomes critical to back up the data. Fortunately, the price of IDE disks has dropped as fast as the capacity has risen and hard drives can be treated as high capacity data cartridges. Backups can easily be done by adding a temporary SATA or USB drive to a system and running an efficient rsync operation to copy any new or changed data to the offline archive.
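The rsync step can be wrapped in a small script so that the same command is run every time the backup drive is attached. The sketch below is one possible arrangement; the function names and the example paths are hypothetical, and only the standard rsync options `-a` (archive), `-v` (verbose) and `--dry-run` are used.

```python
import subprocess


def build_rsync_command(source_dir, dest_dir, dry_run=False):
    """Build an rsync command that copies new or changed files to dest_dir."""
    cmd = ["rsync", "-av"]
    if dry_run:
        cmd.append("--dry-run")  # list what would be copied without copying
    # A trailing slash makes rsync copy the directory contents rather than
    # creating an extra directory level inside dest_dir.
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(dest_dir)
    return cmd


def run_backup(source_dir, dest_dir, dry_run=False):
    """Run the backup and raise CalledProcessError if rsync fails."""
    return subprocess.run(build_rsync_command(source_dir, dest_dir, dry_run), check=True)
```

For example, `run_backup("/data/music", "/mnt/backup/music", dry_run=True)` would show what a real run would transfer, assuming those directories exist on the system.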




Refining the Process of Digitizing Vinyl Records

Posted Dec 25, 2008 2:08 UTC (Thu) by pr1268 (guest, #24648) [Link] (12 responses)

An interesting phenomenon was observed with some recordings that were recorded with too much gain. On loud passages, as the waveform reached the upper or lower limit (rails in electronic-speak), instead of just flattening out, a complete inversion of the wave would occur, resulting in harsh sounding rail-to-rail glitches.

This behavior is likely caused by an integer over/under-flow issue in the PCM WAV representation used by a lot of audio editors. Just my intuitive thinking here...

My experience using audio-editing software is chiefly with Goldwave (http://www.goldwave.com). Granted, it's Windows-only software, but older versions (e.g. 4.26) work flawlessly in Linux with Wine. In an identical situation to Forrest's, Goldwave will clip the waveform in a "flat" fashion (like a plateau, for lack of a better analogy), but not oscillate like shown here.

Thanks for the article; I enjoy discussing audio editing on computers!

Refining the Process of Digitizing Vinyl Records

Posted Dec 26, 2008 10:30 UTC (Fri) by magnus (subscriber, #34778) [Link] (11 responses)

I think this phenomenon could come from inside the A/D converter itself, ADC:s can act strange when getting out-of-range voltages. The M-Audio board probably has less protection from this than standard consumer audio HW.

I used to fiddle a bit with Goldwave too, during my Windows days back in the 90's..

Refining the Process of Digitizing Vinyl Records

Posted Dec 26, 2008 18:13 UTC (Fri) by boog (subscriber, #30882) [Link] (10 responses)

The problem could indeed be electronic. The graph in the image to the right of the relevant text is a bit ambiguous (what is represented? units??), but if it shows what should be a large positive voltage becoming negative and vice versa, you could be seeing "phase inversion". This is a fairly common phenomenon observed when FET-input operational amplifiers are over-driven. So in principle it could be occurring anywhere between the preamp and the analog to digital converter. As the signal would in any case be saturated without the phase inversion (and therefore distorted), your only option is to reduce the gain.

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 1:36 UTC (Sat) by pr1268 (guest, #24648) [Link] (9 responses)

The graph in the image to the right of the relevant text is a bit ambiguous (what is represented? units??)

I believe it represents the audio level as a function of time. Since all sound is a physical phenomenon of oscillating waves of varying air pressure, then the representation of a sound can be drawn as a mathematical function across the time domain. The "units" are essentially arbitrary, but one can choose a range of discrete values with which to represent any sound (this Wikipedia page has a good overview).

The compact disc digital audio standard of signed 16-bit integer values gives 65536 distinct values, 32767 positive, 32767 negative, and one "zero" value (the flat horizontal line in the middle). The one missing value (65536 - (32767 * 2 + 1) = 1) represents an overflow/underflow condition (I believe -32768 for the CD audio standard). Considering if you use a 16-bit signed integer, say a C/C++ signed short, and you add 1 to the sound level variable whose value is already +32767 (max positive), then this will give -32768, which may manifest itself as a near-instantaneous oscillation to max negative.
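The wraparound described here is easy to try out. The following sketch uses Python's `ctypes.c_int16`, which stores values in 16 bits without overflow checking, to imitate a two's-complement signed short, and compares it with the saturating ("plateau") behavior one would expect from ordinary clipping. The helper names are made up for this illustration, and nothing here shows that this is what actually happened in the recording.

```python
import ctypes

INT16_MAX = 32767
INT16_MIN = -32768


def wrap_int16(value):
    """Store value in a 16-bit signed integer, wrapping around on overflow."""
    return ctypes.c_int16(value).value


def saturate_int16(value):
    """Clamp value to the 16-bit signed range, producing a flat plateau."""
    return max(INT16_MIN, min(INT16_MAX, value))


print(wrap_int16(INT16_MAX + 1))      # wraps to the most negative value
print(saturate_int16(INT16_MAX + 1))  # stays at the most positive value
```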

Again, I'm discussing this merely from intuition and prior education/experience, but I don't mean to insinuate that this is what's happening in the image. I certainly don't mean to summarily dismiss others' theories. And thanks for sharing, BTW!

Does anyone sense that I enjoy pontificating on digital multimedia? ;)

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 5:55 UTC (Sat) by boog (subscriber, #30882) [Link] (6 responses)

"I believe it represents the audio level as a function of time."

<grumble>

The point is the graph doesn't make that specific, when one would have hoped it would. Diagnosis of the problem would be helped by knowing what was being represented. (I don't see why audio level, which is usually logarithmic, would be symmetric about zero.)

It's unfortunately quite common for programs (and programmers) to make assumptions about units, i.e., not displaying them, which can only lead to trouble (Mars missions, anybody?) The author's omission of axis labels etc may be partly due to the program used.

</grumble>

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 16:37 UTC (Sat) by pr1268 (guest, #24648) [Link] (5 responses)

(I don't see why audio level, which is usually logarithmic, would be symmetric about zero.)

It's symmetric about zero since all pure tones are sinusoidal. This includes overtones commonly found in music. Consider that a sine function of time is similarly symmetric about zero.

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 17:09 UTC (Sat) by dlang (guest, #313) [Link] (4 responses)

thanks for the perfect example of the clarity problem

the parent of your post was talking about the signal level while you are talking about the waveform

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 17:42 UTC (Sat) by pr1268 (guest, #24648) [Link] (3 responses)

The waveform is a representation of the signal level.

If I'm only adding to the clarity problem, then I'll gladly step aside.

Refining the Process of Digitizing Vinyl Records

Posted Dec 27, 2008 20:58 UTC (Sat) by boog (subscriber, #30882) [Link] (2 responses)

None of us is being very clear :-)

Through most of the electronics, the audio signal will be represented by a voltage, which will indeed be a sum of sine waves varying about zero (there is no DC component in audio signals). I _guess_ that is what the graph represents - voltage samples against time. Audio level is, I believe, something like the logarithm of the rms power or magnitude of the signal, relative to some reference value. The level is therefore not sinusoidal or symmetrical, except under very unusual circumstances.
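The distinction between the waveform and the level can be made concrete with a short sketch. It assumes samples scaled to the range -1.0 to +1.0 and uses that full-scale value as the reference; the function names are illustrative, and other reference levels and averaging windows are equally valid choices.

```python
import math


def rms(samples):
    """Root-mean-square magnitude of a block of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples, reference=1.0):
    """Level of the block in decibels relative to the reference magnitude."""
    value = rms(samples)
    if value == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(value / reference)
```

The samples swing symmetrically above and below zero, but the level computed from them is a single number per block: 0 dB for a full-scale square wave and increasingly negative for quieter material.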

Happy new year to all.

Refining the Process of Digitizing Vinyl Records

Posted Dec 30, 2008 15:26 UTC (Tue) by tialaramex (subscriber, #21167) [Link] (1 responses)

I guess I can help a little here...

The "graph" isn't a graph, it's a fragment of screenshot, probably from Audacity, a portable program for manipulating digital audio.

Inside Audacity there are no voltages, signals etc. just linear PCM data. PCM (Pulse Code Modulation, hope I'm not teaching my grandmother to suck eggs) converts a band-limited signal into a series of sample values, which can be used with appropriate hardware to recreate the same signal. Most software (and a lot of hardware) for processing audio uses PCM as a representation.

In this screenshot the PCM sample values are indicated by the blue dots. As others have pointed out, there's no need for a scale, the minimum and maximum are arbitrary unless a calibrated input is used, and Audacity doesn't know about calibration. If you prefer, imagine the upper limit is designated +1.0 and the lower -1.0 as it might be in a more professional program.

Audacity's developers are enthusiastic, but not always very technically capable. This zoomed in PCM display is an example. Those lines joining the dots are nonsense. PCM data isn't a continuous wave, it's just a lot of samples. So the blue dots shouldn't be joined by lines, they should either stand alone or at the end of independent vertical lines starting from the center zero (to represent impulses). However, if you must join them with lines, because it pleases you to think of the PCM data as a "waveform" then the lines must be drawn to fit the constraint of the waveform that would actually be generated from the PCM data, and that means a sinc function, not the stair-step function used in cheap-and-nasty Windows programs nor Audacity's even more silly "join the dots with a straight line" approach.
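For anyone curious what the sinc-based drawing would involve, here is a minimal sketch of the Whittaker-Shannon interpolation formula applied to a short run of samples. It is truncated to the samples given rather than an infinite series, so values near the ends are only approximate, and the function names are made up for the example.

```python
import math


def sinc(x):
    """Normalized sinc function: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, t):
    """Estimate the band-limited waveform at fractional sample position t."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))


samples = [0.0, 1.0, 0.0, -1.0]
print(reconstruct(samples, 1.0))  # passes through the sample itself
print(reconstruct(samples, 1.5))  # differs from the straight-line midpoint 0.5
```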

It's just conceivable that Audacity could have a bug which would cause this "off by one" problem, but it's much more likely to be a hardware bug or limitation in the equipment doing the PCM sampling. On the other hand, our esteemed Editor uses a fairly good sound card, a Delta 44. If I was trying to track down the problem I'd advise the editor to get down to the hardware ALSA device and use arecord to record the input when it's clipping, and see if the same problem is observed. If so, most likely the M-Audio device or its driver is at fault. Otherwise, the blame looks more likely to lie in userspace software.

Refining the Process of Digitizing Vinyl Records

Posted Jan 10, 2009 22:10 UTC (Sat) by magnus (subscriber, #34778) [Link]

"Audacity's developers are enthusiastic, but not always very technically capable. This zoomed in PCM display is an example. Those lines joining the dots are nonsense."

I think you are painting a very black-and-white picture here. There is no law dictating how data should be visualized.

You are absolutely right that, if you want to show the "true" data you should show only dots or impulses, and if you want to show what (ideally) the analog audio waveform will look like when the data is played, then you should upsample the data to the screen resolution (with a sinc function as you describe) and plot it as a continuous wave.

However, a very common way of drawing time series data in general is to draw it as points connected with straight lines. Even someone not very technical looking at such a plot quickly understands that the data is sampled and that the lines are "cosmetic". An upsampled plot might give the impression that there is more data than there actually is.

So, I would wait a little longer before calling the Audacity developers incompetent. Anyways, they probably have more urgent features to develop than making microscopic zoom levels look better.

Best regards,
Magnus

Refining the Process of Digitizing Vinyl Records

Posted Jan 10, 2009 21:21 UTC (Sat) by magnus (subscriber, #34778) [Link] (1 responses)

Do you have a reference for the forbidden -32768 value on CD:s? First time ever that I heard about this, very interesting..

Refining the Process of Digitizing Vinyl Records

Posted Jan 11, 2009 1:20 UTC (Sun) by pr1268 (guest, #24648) [Link]

I thought I remember reading it somewhere in a digital multimedia textbook, say perhaps Ken Pohlmann's Principles of Digital Audio. But I may be mistaken—feel free to correct me if you know better.

I do know that the CDDA standard used in audio CDs is a signed 16-bit resolution, which equates to a short integral type used in many programming languages, with a value range of -32768 to +32767.

I'm unsure that the value -32768 is "forbidden"—it merely represents a sound level signal overload. While recording engineers may have been loath to approach "maxxing out" the audio signal back in the early days of the music CD, it happens often nowadays as most CDs are mastered as loud as possible.

gnome wave cleaner

Posted Dec 25, 2008 9:40 UTC (Thu) by ekonijn (subscriber, #6395) [Link]

Perhaps also have a look at gnome_wave_cleaner: http://gwc.sourceforge.net/, available in Debian as gwc.

It's designed specifically to clean up digitized vinyl. That means the important operations are not hidden in a menu but directly accessible from a toolbar, and there is a sonagram view that makes it easy to find the exact place of those clicks that escape automatic cleanup.

Refining the Process of Digitizing Vinyl Records

Posted Dec 25, 2008 14:51 UTC (Thu) by phip (guest, #1715) [Link] (1 responses)

Does anyone have any good pointers to similar articles about digitizing audio from cassette tapes?

Thanks in advance,
Phil

Digitizing cassette tapes

Posted Dec 26, 2008 2:31 UTC (Fri) by pr1268 (guest, #24648) [Link]

While I don't have a URL for you, I can suggest the following basic technique:

Use a Sony Walkman™-style player (this one is still made and sold), a wire with a 1/8-inch (3.5 mm) stereo mini plug on both ends, and the stereo line input of a sound card (most sound cards, including on-board cards, have these). Similar to what Forrest's article mentions, grabbing the whole side of a tape at a time is the recommended way, and you can split up individual tracks after processing the audio.

Most audio editing applications have tools like noise/hiss reduction and speed alteration, so you can use the software to correct anomalies produced by the playback chain (or those due to using cheap equipment).

While my earlier post mentioned the Windows-only Goldwave software, I do know that Audacity provides the editing tools needed to do this (and it's GPL and included with many Linux distros). Lame, BladeEnc, and oggenc(1) are other tools I use for audio format conversions.

This is how I've done it for 9 years now (including 6+ years in Linux). Forgive me for bragging, but I've created CDs from tapes for clients who later claimed that the audio quality was "amazing". And I use the $40 USD Walkman linked above.

The Outsourced Alternative

Posted Jan 2, 2009 20:03 UTC (Fri) by craigrmeyer (guest, #55899) [Link] (1 responses)

I just want to point out, folks, that there's a top-level alternative to all this tinkering around with software. One can instead just send his records to an outside digitizing service, such as my own: Reclaim Media.

Not only do we apply the incredibly-effective Izotope RX pop-filtering algorithm to each and every record we do, but we also physically clean our records as well right before we play them.

Physical cleaning is really important. Most of the hiss and crackle that you hear from a record is from dirt and dust in the grooves. These cleaning machines cost a bit of money, though, so they can only be easily justified in a large-scale "industrial" operation like ours where we do up to hundreds of records a day.

One more thing: Unless your free time is worth less than minimum wage, we're often a better deal than digitizing at home anyway, though of course this depends on how quickly and efficiently you do it. I've written an article "Why Not Digitize My Cassettes & Records at Home?" about exactly this. Before spending much more time or money on digitizing your records at home, give it a look and decide whether its arguments apply to you.

If so, then you really owe it to yourself to give us a try!

Thank you,
Craig Meyer
Founder and Engineer, ReclaimMedia.com

The Outsourced Alternative

Posted Jan 3, 2009 1:12 UTC (Sat) by deunan_knute (guest, #290) [Link]

Craig- Given that this is solely a technical forum, i don't think that your post is appropriate. It would be different however if you had some technical insight (as someone who obviously deals with these issues, given your occupation) to add to the conversation.

I'm not trying to be rude, and i realize you were not trolling, but one reason i and others love LWN is the high signal to noise ratio.

happy holidays all the same!


Copyright © 2008, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds