Refining the Process of Digitizing Vinyl Records
In October, your author discussed the process of digitizing vinyl records for the creation of a digital audio library. Since that time, the process has been performed on around 40 disks and a number of refinements have been made. This article discusses what has been learned in that time.
One part of the digitizing process that has proven to work well involved treating one side of the original media as a single chunk of data. Many of the processing steps can be performed on these large data chunks before splitting up the individual tracks.
After making numerous recordings, it was discovered that a single record level, 93 on the inputs of the M-Audio Delta 44, consistently produced recordings with a useful volume range on the majority of the records that were copied. An interesting phenomenon was observed on some recordings made with too much gain. On loud passages, as the waveform reached the upper or lower limit (the rails, in electronic-speak), instead of just flattening out, a complete inversion of the wave would occur, resulting in harsh-sounding rail-to-rail glitches. The source of the problem is open to speculation. If this should occur, it is best to make a new recording of the album side with a lower input level.
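A quick scan of the samples can reveal whether a recording suffers from this inversion artifact before any time is invested in cleanup. The following is only a sketch, assuming 16-bit samples held as Python ints (e.g. decoded from a .wav file); the function name and threshold are illustrative:

```python
def find_rail_glitches(samples, limit=32767, threshold=0.98):
    """Return indices where the signal jumps from near one rail straight
    to the other between consecutive samples -- the inversion artifact
    seen on overdriven recordings."""
    rail = limit * threshold
    glitches = []
    for i in range(1, len(samples)):
        if samples[i - 1] > rail and samples[i] < -rail:
            glitches.append(i)
        elif samples[i - 1] < -rail and samples[i] > rail:
            glitches.append(i)
    return glitches
```

If such a scan reports more than a handful of hits, re-recording the side at a lower input level is likely less work than repairing the damage.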
Having two machines handy has helped to optimize the audio processing work. One machine is dedicated to making the initial album side recordings. The sides are minimized in size by removing data before and after the recorded audio, and fade-ins and fade-outs are added to the whole album side. The album sides are then copied to another machine with a faster processor for further processing. The original copy is kept around as a backup until the side has been fully processed. After copying the recorded album side to the secondary machine, a new recording can be started on the recording machine.
The process of removing clicks and scratches from an album side has seen the most changes since the original article. This is a bit of a learned art. The first step now involves visually inspecting the waveform of the album side with Audacity. Often a few huge spikes will be visible on the recording. They can be removed by repeatedly selecting an area and zooming in until the zoom resolution shows individual samples as dots. The repair operation should be performed on all of the large clicks. Smaller clicks can often be found and removed by zooming into the quiet passages; an almost infinite amount of hunting, zooming and repairing can be done.
Another good way to find clicks is to listen, pause, remove and move on. Most tracks can be cleaned up to a reasonable level without too much effort. Some albums can contain an incredible number of clicks while others can be nearly click-free. After the manual deglitching is done, the automated click removal step can be performed. This is now optional, but it can find additional clicks that are buried in busy waveforms.
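For those inclined to script part of the hunt, a pass that flags suspicious sample-to-sample jumps can narrow down where to zoom in. This is only a sketch, again assuming 16-bit samples as Python ints; the jump threshold is a guess that would need tuning per recording, since loud percussive music produces large legitimate jumps:

```python
def find_click_candidates(samples, jump=5000):
    """Flag indices where consecutive samples differ by more than `jump`.
    In quiet passages such jumps are usually clicks rather than music."""
    return [i for i in range(1, len(samples))
            if abs(samples[i] - samples[i - 1]) > jump]
```

The reported indices are only candidates; each one still deserves a look (and a listen) in Audacity before being repaired.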
After whatever amount of declicking seems reasonable, the audio is exported from Audacity as a .wav file. Before exiting Audacity, the Stereonorm script (available here) is run on the .wav file to bring the left and right channel levels up to 100% volume. If the normalization numbers look reasonable compared to the Audacity visual representation of the recording, Audacity is exited and restarted with the normalized recording. At that point it is often possible to spot and remove more offending large clicks, export again, and rerun the normalization step. Although it may make audiophiles cringe, it may be beneficial to use the repair function to shave the level off on the peaks of loud percussive waveforms. Done sparingly, this can be used to fix balance problems encountered during the normalization step.
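The Stereonorm script itself is not reproduced here, but the basic idea of bringing a channel up to 100% volume can be sketched as simple peak normalization, applied to each channel independently. This is a minimal sketch assuming 16-bit samples as Python ints; it is not the actual script:

```python
def normalize_channel(samples, limit=32767):
    """Scale one channel so its loudest sample reaches full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # silent channel: nothing to scale
    gain = limit / peak
    return [round(s * gain) for s in samples]
```

Normalizing each channel on its own is what makes a single overdriven percussive peak a problem: one outlier holds that whole channel's gain down, which is why shaving such peaks before rerunning the step can fix a left/right balance mismatch.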
The version of Audacity that your author has been using, 1.3.4-beta on Ubuntu 8.04, has a few bugs that can cause crashes and the loss of time-consuming work. Occasionally, after doing a lot of repairs, attempting to export a file as .wav produces a long stream of zero-length write errors. It is usually possible to recover from this by writing out the data in the Audacity native .aup format, exiting and restarting Audacity with the .aup file, and trying the .wav export again. On numerous occasions, adding a label track followed by doing more click repairs has caused Audacity to crash. It is advisable to perform the labeling step on a new instantiation of Audacity. Hopefully these bugs will disappear when the system gets updated to a newer version of Audacity.
After investing many hours into the creation of a large audio library (now up to around 200GB), it becomes critical to back up the data. Fortunately, the price of IDE disks has dropped as fast as the capacity has risen and hard drives can be treated as high capacity data cartridges. Backups can easily be done by adding a temporary SATA or USB drive to a system and running an efficient rsync operation to copy any new or changed data to the offline archive.
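The backup step described above might look something like the following sketch. The paths are hypothetical mount points; the rsync options shown are the standard archive flags, which copy only new or changed files on each run:

```python
def backup_command(library_dir, archive_dir):
    """Build an rsync command mirroring the library to an offline archive.

    -a preserves permissions and timestamps; -v lists what was copied.
    The trailing slash makes rsync copy the *contents* of library_dir
    into archive_dir rather than nesting a new directory inside it.
    """
    return ["rsync", "-av", library_dir.rstrip("/") + "/", archive_dir]

# Hypothetical mounts -- e.g. run with:
# subprocess.run(backup_command("/home/audio/library", "/mnt/backup/library"), check=True)
```

Because rsync skips unchanged files, subsequent backups after a recording session take only as long as the new album sides require.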
