Andrew Rondeau

Software Developer, Audio Enthusiast

Audio

Audio-related topics

I wrote a tool to investigate degradation of audio in lossy compression, like mp3 and aac. With this tool, I demonstrated that high bitrate lossy audio is not sufficient for audiophile-quality recordings; and that a market for lossless recordings is justified.

For the past few years I've been interested in how good, or bad, lossy audio compression really is. When I occasionally listen to lossless versions of my music, I suspect that they sound better; yet the placebo effect means that I don't trust my own ears.

There is evidence that, in practice, a well-trained ear can tell the difference between lossless and lossy audio. When I ripped my DVD-Audio disks (in surround sound), I read through the EBU Evaluations of Multichannel Audio Codecs. The conclusion that I took from the article is that, in the correct conditions, an audiophile will hear the difference between lossy and lossless versions of a recording.

I do find anecdotal evidence that "so and so" ran "such and such" a test and it concluded that no one can tell the difference. I do not trust these anecdotes, as audio perception is highly subjective; the test methods must also be scrutinized.

It is very difficult, if not impossible, to find more objective data that compares audio codecs. The kinds of tests performed in the EBU's study require lots of quality equipment, people, and time. A quick and objective way to compare a compressed file with the original is also needed. My program allows for objective comparisons, although the data requires interpretation.

I recently became interested in this topic again, as the Tidal music streaming service offers lossless streaming. Most of my music is managed with iTunes Match, where it is not practical to use lossless.

It's also important to understand that what's commonly called "lossless" still involves perceptual manipulation in order to save space. Specifically, noise shaping, used to decimate a 24-bit recording master to 16-bit, involves perceptual manipulations. "Its purpose is to increase the apparent signal-to-noise ratio of the resultant signal." As a consequence, a 16-bit lossless flac file, noise shaped from a 24-bit master, will have some high-frequency hiss below the absolute threshold of hearing.

The Experiment

To evaluate compression, I decided to rip three different songs from different DVD-Audio disks:

  • Don't Go
  • Don't Let It Show
  • Flight Test

I chose two modern recordings to ensure that the full capability and frequency range of 24-bit, high sampling rate, audio, was utilized.

All rips were the stereo version of the songs, and were converted to 48 kHz, floating point wav files. I called the 48 kHz, floating point version my "original." To convert, I used sox:

sox (24-bit from dvda.wav) -e floating-point original.wav  rate -v 48k

I chose 48 kHz because lossy codecs, and streaming services, don't support higher sampling rates. This is somewhat of a shame, because "Under ideal laboratory conditions, humans can hear sound as ... high as 28 kHz." Hopefully sometime I can explore this topic further.

(One curious thing that I learned is that "Don't Let it Show" has no material over 22.05 kHz. This really surprised me, as the disk made a very big deal about having a 96 kHz version for DVD-Video, and 192 kHz for DVD-Audio.)

I created many different compressed versions of each song:

  1. 16-bit noise-shaped flac, via sox
  2. 320 kbps aac, via iTunes
  3. 320 kbps mp3, via xAct Audio Copy
  4. 320 kbps opus, via xAct Audio Copy

I also made some experimental "lossless" decimated versions of Don't Go:

  1. 8-bit noise-shaped flac, via sox
  2. 8-bit noise-shaped flac, with equalization to simulate emphasis and noise reduction, via sox and Audacity
  3. 12-bit noise-shaped flac, via sox

The Program

The source code to my comparison program is available on GitHub; see MeasureDegredation.

My program expects that both the original and compressed files are 32-bit floating point wav files at 48 kHz. This dramatically simplified my implementation, although I did have to rely on external tools, such as sox and Audacity, to convert the compressed file to a floating point wav file.

I chose floating point because it avoids quantization error and biases that originate from noise shaping. As I demonstrate in this article, decimating to 16-bit can be considered "lossy," depending on how many hairs one wants to split. Thus, it's critical that, for an accurate comparison, lossy files are decoded to floating point.

The program outputs an Excel spreadsheet for further analysis. I tried to generate graphs in my program but could not, so I added the graphs manually afterwards.
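As an illustration of the input requirement, here is a minimal sketch that checks whether a wav file is 32-bit floating point at 48 kHz by reading its `fmt ` chunk. It is not taken from MeasureDegredation; the function names are hypothetical, and it assumes a plain RIFF/WAVE file that uses format tag 3 (IEEE float) rather than the extensible header variant.

```python
import io
import struct

WAVE_FORMAT_IEEE_FLOAT = 3  # standard WAV format tag for floating-point samples


def read_fmt_chunk(stream):
    """Return (format_tag, channels, sample_rate, bits_per_sample) from a WAV stream."""
    riff, _size, wave = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no fmt chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            fields = struct.unpack("<HHIIHH", stream.read(16))
            tag, channels, rate, _byte_rate, _block_align, bits = fields
            return tag, channels, rate, bits
        # Skip any other chunk; chunk bodies are padded to an even length.
        stream.seek(chunk_size + (chunk_size & 1), io.SEEK_CUR)


def is_expected_input(stream):
    """True when the stream is a 32-bit float, 48 kHz wav file."""
    tag, _channels, rate, bits = read_fmt_chunk(stream)
    return tag == WAVE_FORMAT_IEEE_FLOAT and rate == 48000 and bits == 32


def build_test_header(tag, rate, bits, channels=2):
    """Hypothetical helper: a tiny in-memory WAV with no audio data, for trying the check."""
    block_align = channels * bits // 8
    fmt = struct.pack("<HHIIHH", tag, channels, rate, rate * block_align, block_align, bits)
    body = b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt + b"data" + struct.pack("<I", 0)
    return b"RIFF" + struct.pack("<I", len(body)) + body


print(is_expected_input(io.BytesIO(build_test_header(3, 48000, 32))))
print(is_expected_input(io.BytesIO(build_test_header(1, 44100, 16))))
```

With a real file, the same check would be run on `open("original.wav", "rb")`.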

The Data

The program performs three different kinds of comparisons:

  • Sample-by-Sample: Effective for examining how directly lossless the compression is
  • Sample-by-Sample in frequency bands: Allows a closer look at lossless compression by breaking the signal into three frequency bands
  • All Frequencies: Compares all frequencies to discover equalization changes, and accuracies, at different frequencies.

The Sample-by-Sample comparison is mainly useful for examining how drastically noise-shaping manipulates the original signal; and how close a lossless audio file is to the original. It shows what the average bits per sample is, if compression were merely rounding the source floating-point version, and what the worst bits per sample is.
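A minimal sketch of what such a sample-by-sample comparison could look like. It assumes samples are floats in the range -1 to 1, that the error is expressed in steps of a 16-bit integer, and that "bits per sample" means 16 minus the base-2 logarithm of that error. These are assumptions for illustration; the function names are hypothetical and the actual program may compute its figures differently.

```python
import math


def bits_per_sample(error_in_steps, target_bits=16):
    """Effective bits when the error spans `error_in_steps` steps of a target_bits integer."""
    if error_in_steps == 0:
        return math.inf  # identical samples: no measurable loss
    return target_bits - math.log2(error_in_steps)


def compare_samples(original, decoded, target_bits=16):
    """Return (average_bits, worst_bits) for two equal-length float sequences."""
    scale = 2 ** (target_bits - 1)  # steps per unit amplitude for a signed integer
    errors = [abs(a - b) * scale for a, b in zip(original, decoded)]
    average_error = sum(errors) / len(errors)
    worst_error = max(errors)
    return (
        bits_per_sample(average_error, target_bits),
        bits_per_sample(worst_error, target_bits),
    )


# Two samples that are each off by exactly four 16-bit steps.
original = [0.0, 0.5]
decoded = [4 / 32768, 0.5 - 4 / 32768]
print(compare_samples(original, decoded))
```

An error of four steps costs two bits, so both figures come out at 14 in this toy case.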

The Sample-by-Sample comparison is also run again with the signal broken into three frequency components. This allows seeing how noise shaping preserves the original frequency bands in a lossless manner.

The All Frequencies comparison is the most useful, as it shows how either equalization is preserved in a lossless file, or dramatically changed in a lossy file. It also attempts to calculate a "bits per sample" for each of the frequency components that make up a recording. It is based on the theory that the ear only hears frequencies' amplitudes; and not phase. Thus, this is the most objective way that I can evaluate audio compression techniques purely from data.
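The idea of comparing amplitudes while ignoring phase can be sketched as follows. This is an illustration only, not the program's implementation: it uses a naive transform over a short block, the function names are hypothetical, and the test signal is synthetic.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin up to Nyquist (naive O(n^2) transform)."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        magnitudes.append(abs(acc) / n)
    return magnitudes


def frequency_response(original, decoded, floor=1e-9):
    """Per-bin ratio of decoded to original magnitude; None where the original is silent."""
    ratios = []
    for ref, test in zip(dft_magnitudes(original), dft_magnitudes(decoded)):
        ratios.append(test / ref if ref > floor else None)
    return ratios


# A tone in bin 4; the "decoded" copy is half as loud and phase shifted.
n = 64
original = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
decoded = [0.5 * math.sin(2 * math.pi * 4 * i / n + 1.0) for i in range(n)]
response = frequency_response(original, decoded)
print(response[4])  # close to 0.5: the phase shift does not affect the result
```

A sample-by-sample comparison of the same two signals would report a large error from the phase shift alone, which is why the two comparisons can disagree.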

Someone who understands audio perception better than I do might find fault in my comparison technique. Keep in mind that the goal is to have an objective comparison that does not rely on listening tests. It's always possible to improve the program used to compare, or to re-run tests with improved encoder settings.

What I did not do in the All Frequencies comparison is measure decibels, or decibels of separation. Such an approach would be even closer to measuring how the ear perceives differences in a compressed file.

Ideally, a comparison that results in one number, or a small set of numbers, describing how close the compressed audio file is to the original, is desirable. I did not create this number, though.

Comparing Each Compression Technique

16-bit noise-shaped flac

Note: It is important to remember that decimating from a 24-bit master to 16-bit is a perceptual compression technique. It's considered lossless because it closely matches the fundamental limitations of human hearing. (Although, there is room for a healthy debate.)

All three songs were converted to 16-bit flac with the following command:

sox original.wav -b 16 16bit.flac dither -s -p 16

After converting, the worst error in all files was approximately 29. This means that, compared to rounding to a 16-bit integer, a noise shaped flac can be approximately ± 29. At its worst, a noise shaped flac is about 11.1 bits per sample. The average error, compared to rounding, is ± 4.7. This gives an average bits per sample of 13.7.
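The figures above are consistent with a simple relationship, assuming the error e is measured in 16-bit steps and the effective bits per sample b is the 16-bit container minus the bits consumed by that error. This is a reading of the reported numbers, not a formula confirmed from the program:

```latex
b = 16 - \log_2 e, \qquad
b_{\text{worst}} = 16 - \log_2 29 \approx 11.14, \qquad
b_{\text{avg}} = 16 - \log_2 4.7 \approx 13.77
```

Because both errors are quoted as approximate, the second value lines up with the reported 13.7 only to within rounding.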

The numbers presented, though, are misleading, as noise shaping is perceptual and is designed to better preserve frequencies where the ear is more sensitive, at the expense of less accuracy where the ear is less sensitive. When looking at different frequency bands:

  • High Frequencies: > 6 kHz: Worst bits per sample is 15.25, average is 19.5
  • Mid Frequencies: < 6 kHz, > 3 kHz: Worst bits per sample is about 17, average is 20.1
  • Low Frequencies: < 3 kHz: Worst bits per sample is about 16.8, average is 19.5

In all three bands, the equalization was unchanged. Just by breaking a 16-bit, noise-shaped, quantized signal into bands, it more closely resembles the original. This is an important point to understand, as the general belief is that no DAC has a signal to noise ratio greater than 21 bit. This means that noise-shaping to 16 bit is very close to the maximum theoretical audio quality achievable on modern equipment.
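To see why a noise-shaped signal can track the original more closely than its sample-by-sample error suggests, consider a minimal sketch of first-order error feedback. This is a deliberate simplification: sox's `dither -s` applies dither and a perceptually weighted filter, whereas this hypothetical example has no dither and simply pushes each rounding error into the next sample. Values are in units of one quantization step.

```python
import math


def quantize_plain(samples):
    """Round every sample to the nearest integer step independently."""
    return [math.floor(x + 0.5) for x in samples]


def quantize_noise_shaped(samples):
    """First-order error feedback: subtract the previous rounding error before rounding."""
    out = []
    error = 0.0
    for x in samples:
        wanted = x - error  # compensate for the error made on the previous sample
        y = math.floor(wanted + 0.5)
        error = y - wanted  # remember the new rounding error
        out.append(y)
    return out


# A very quiet, constant signal: 0.3 of one quantization step.
signal = [0.3] * 1000
plain_mean = sum(quantize_plain(signal)) / len(signal)
shaped_mean = sum(quantize_noise_shaped(signal)) / len(signal)
print(plain_mean, shaped_mean)
```

Plain rounding loses the signal entirely, while the shaped output alternates between 0 and 1 so that its average stays near 0.3. The error has not disappeared; it has moved to rapid sample-to-sample changes, which correspond to high frequencies.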

When looking at each frequency (without looking at phase), 16-bit noise shaping becomes easier to understand.

Don't Go:

[Figure: Don't Go: 16-bit, noise shaped FLAC, bits per sample, per frequency]

[Figure: Don't Go: 16-bit, noise shaped FLAC, frequency response]

Don't Let It Show:

[Figure: Don't Let It Show: 16-bit, noise shaped FLAC, bits per sample, per frequency]

[Figure: Don't Let It Show: 16-bit, noise shaped FLAC, frequency response]

Flight Test:

[Figure: Flight Test: 16-bit, noise shaped FLAC, bits per sample, per frequency]

[Figure: Flight Test: 16-bit, noise shaped FLAC, frequency response]


All three frequency response graphs have a sparkle in the highest frequencies where the ear is less sensitive. They are essentially flat until about 15 kHz, and as they get over 20 kHz, they are ± 3%. The graph for Don't Let it Show is harder to read, because it shows 0% for frequencies over 22.05 kHz. The other graphs are zoomed in at the top percents.

This data makes the term lossless somewhat ambiguous: a 16-bit reduction at 48 kHz, or 44.1 kHz, is generally agreed to be very close to the fundamental limitations of human hearing, but are the changes shown here lossless? Does lossless mean an almost flat frequency response, 18-20 bit resolution where the ear is most sensitive, and 11-13 bit resolution where the ear is less sensitive?

Certainly, the term "mathematically lossless" is well defined, as Flac does not make any changes to an integer (16-bit or 24-bit) audio signal. But where do we draw the line between lossless and lossy when changes are made, such as when decimating from 24-bit to 16-bit? How many hairs need to be split to differentiate between lossy and lossless?

In all three cases, a 16-bit lossless Flac was 3x the size of a 320 kbps mp3, aac, or opus file.
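For a sense of scale, the size ratio can be put next to the bitrate of uncompressed audio. The sketch below is plain arithmetic on the figures in this article; the function name is hypothetical, and the Flac bitrate is only an estimate derived from the "3x" observation.

```python
def pcm_kbps(sample_rate_hz, bits, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits * channels / 1000


raw_16_bit = pcm_kbps(48000, 16)        # uncompressed 16-bit stereo at 48 kHz
flac_estimate = 3 * 320                 # "3x the size" of a 320 kbps file
flac_ratio = flac_estimate / raw_16_bit # fraction of the uncompressed size

print(raw_16_bit, flac_estimate, flac_ratio)
```

Under these assumptions, Flac lands at roughly five eighths of the uncompressed 16-bit size, while the lossy files are about one fifth of it.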

320 kbps mp3

Mp3 is the oldest perceptual lossy codec that I investigated. Its widespread popularity necessitates investigation; even though it is considered obsolete. In the approximately 20 years that we've used mp3, the encoders improved considerably.

To generate my mp3s, I used xAct Audio Copy. I selected ABR at 320 kbps.

When comparing mp3 on a sample-by-sample basis, predictably, the biggest differences were huge. The "worst error" was off by over 40,000 (16-bit), implying a worst-case scenario of less than one bit per sample. On average, mp3 was about 3-4 bits per sample, which would sound worse than a cassette tape. Even when comparing at frequency bands, the error was still similar.

Don't Go:

[Figure: Don't Go: Bits per sample per frequency]

[Figure: Don't Go: Frequency Response]

Don't Let It Show:

[Figure: Don't Let It Show: Bits per sample per frequency]

[Figure: Don't Let It Show: Frequency Response]

Flight Test:

[Figure: Flight Test: Bits per sample per frequency]

[Figure: Flight Test: Frequency Response]


Looking at the data, the low-frequency accuracy is just atrocious, even at a high bitrate. The error is so bad at times that low frequencies have negative bits per sample. The equalization curve is also anything but flat.

Based on this data, I would never use mp3 for any lossy audio, unless hardware limitations forced me to use it. It is completely unacceptable to use for any streaming or pay-to-download service. As a result, I will never pay for anything that uses mp3.

It is time for mp3 to go the way of the 78 shellac record and the 8-track.

320 kbps aac

Mp3's authors created aac to continue their work and make improvements. It is the natural successor to mp3, and is what iTunes uses.

I encoded all of these files in iTunes. Unlike the other files in this test, these files were encoded a few years ago when I originally bought the DVD-Audio disks. I directly imported the high resolution (and high sampling rate) wav file into iTunes, and relied on iTunes to perform all downsampling to 48 kHz.

Surprisingly, when looking at sample-by-sample and banded, aac averages over 9 bits per sample for Don't Go. For the other files, the error is much worse.

Don't Go:

[Figure: Don't Go: Bits per sample per frequency]

[Figure: Don't Go: Frequency response]

Don't Let It Show:

[Figure: Don't Let It Show: Bits per sample per frequency]

[Figure: Don't Let It Show: Frequency response]

Flight Test:

[Figure: Flight Test: Bits per sample per frequency]

[Figure: Flight Test: Frequency response]


For Don't Go and Don't Let it Show, aac had a somewhat flat frequency response, only really showing changes in the ultra-high frequencies that people hear poorly. The bits per sample was low in low frequencies, but steadily climbed.

Surprisingly, Flight Test showed problems. Low frequencies had negative bits per sample, and the equalization favored bass too much.

I've happily listened to aac for about a decade, but usually in the car or in office settings. After seeing this data, it's clear that high bitrate aac degrades a file too much for high quality listening.

320 kbps opus

To encode Opus, I used xAct Audio Copy. I chose constrained VBR with a 60 ms window. A short window setting allows Opus to work in low-latency applications; in this case, I was more concerned with optimizing for audio quality than for telephony.

One of the most impressive things about opus is that it results in a file that's, on average, about equivalent to a rounded-to-8-bit lossless audio file, both when comparing by samples and when looking at frequency bands. This is very impressive for a "lossy" audio format.

Anecdotally, Opus appeared more CPU-intensive than the other codecs. I did not measure CPU or compression time in any objective manner, though.

Don't Go:

[Figure: Don't Go: Bits per sample per frequency]

[Figure: Don't Go: Frequency Response]

Don't Let It Show:

[Figure: Don't Let It Show: Bits per sample per frequency]

[Figure: Don't Let It Show: Frequency response]

Flight Test:

[Figure: Flight Test: Bits per sample per frequency]

[Figure: Flight Test: Frequency response]


When looking at each frequency, opus has a much more impressive bits per sample per frequency. Compared to mp3 and aac, it can accurately represent all frequencies. Frequency response is generally flat, with some dips where the ear is least sensitive.

Objectively, Opus is the most accurate lossy format. When comparing file size, Opus is 2/3rds the size of an 8-bit flac, but without audible hiss or artifacts. Opus's accuracy also emphasizes that the term "lossless" is quite subjective. It has a very flat frequency response, and when comparing on a sample-by-sample basis, comes very close to 8 bits per sample.

Noise-shaping to lower bits per sample Flac

I also decided to experiment slightly with converting Don't Go to different forms of an 8-bit and 12-bit Flac. The 8-bit flac was about half the size of the 16-bit Flac, and only about 1/3rd larger than the 320 kbps lossy file. The 12-bit Flac was double the size of a 320 kbps lossy file.

(I also generated a rounded-to-8-bit file, but it sounded too poor to be considered.)

Ordinary Noise Shaping to 8-bit

My first experiment was to use ordinary noise shaping:

sox original.wav -b 8 8bitnoiseshaped.flac dither -s -p 8

This resulted in a hissy sounding file. This hiss was about what a cassette with Dolby B sounds like, although at a higher frequency.

[Figure: 8-bit, noise shaped, bits per frequency]

[Figure: 8-bit, noise shaped, frequency response]

Some Equalization with noise-shaping at 8-bit

The audio quality, while hissy, was somewhat promising. What if I could apply some analog-style noise reduction? I couldn't find any easily accessible way to digitally emulate a noise reduction circuit, so I equalized the original wave file to have very high treble, and then converted it to an 8-bit Flac. On "playback," I equalized using the inverse.

An interesting side-effect happened. Equalization resulted in lots of clipping; the clipping then resulted in a "pumping" sound in the playback side. I fixed this with normalizing after equalization, both before converting to Flac and after converting from Flac.

Upon applying the inverse equalization, most hiss remained. This could probably be fixed with some compression / expansion, but I will admit that I don't understand these filters well enough to use them for this purpose.

[Figure: Equalization in Audacity]

[Figure: 8-bit bits per frequency with equalization and noise reduction]

[Figure: 8-bit frequency response with equalization and noise reduction]

Looking at the data, noise shaping to 8-bit is promising. I suspect that, when pairing with a digital emulation of analog-style noise reduction, an 8-bits-per-sample approach can yield a pleasant-sounding result.

Noise-shaping to 12-bit

Noise-shaping at 12-bit yielded a very accurate file. On a sample-by-sample basis, it was 9.7 bits per sample. When broken into frequency bands, it was 16.1 bits per sample between 3 and 6 kHz, and 15.5 bits per sample otherwise.

When I listened to the file, I could not tell any differences. My listening tests, though, are unscientific. I did not listen at a very loud volume, and I suspect that some hiss is present at loud volumes.

When comparing on a frequency-by-frequency basis, equalization was flat until the high frequencies where the ear is less sensitive.

[Figure: 12-bit, noise shaped, bits per frequency]

[Figure: 12-bit, noise shaped, frequency response]

One thing to point out is that 16-bit audio is generally considered 96 dB of separation without noise shaping. This is below the 120 dB threshold of pain. This means that someone might be able to tell the difference between 12-bit noise shaped and 16-bit noise shaped when listening at loud volumes. I do wonder how 12-bit noise shaped compares to 16-bit rounded. Likewise, could analog-style noise reduction techniques allow for a 120 dB signal to noise ratio with noise-shaped 12-bit audio?
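The 96 dB figure follows from the usual rule of thumb of about 6 dB per bit for plain (unshaped) quantization. A short sketch of that arithmetic, with hypothetical function names, shows where the other bit depths discussed here fall:

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)


def bits_for_db(db):
    """Bits needed for a given dynamic range, ignoring dither and noise shaping."""
    return db / (20 * math.log10(2))


for bits in (8, 12, 16, 24):
    print(bits, round(dynamic_range_db(bits), 1))

print(round(bits_for_db(120), 1))
```

By this measure 12-bit gives roughly 72 dB, and a 120 dB range needs about 20 bits without noise shaping. Noise shaping changes where the noise sits rather than the total, so these numbers are a baseline, not a limit on perceived quality.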

In Summary

In summary, whenever audio is presented to discerning listeners, it's critical that a lossless version is available. Appreciating lossless audio is not a placebo effect, but instead is something that is measurable with real data.

Codecs like mp3 and aac are too outdated for modern use. Their equalization changes are too drastic, even for discerning listeners.

Opus is the most promising codec when bandwidth is limited. Its equalization is the flattest of all three lossy codecs, although there are noticeable dips in the graph. When writing a new audio application, using Opus when bandwidth is limited may give the most pleasing results.

One interesting area to investigate is to see if there is a way to have a slightly smaller file than Flac, but still preserve audio quality. Can older analog-style noise reduction techniques, combined with a noise-shaped 8-bit, or 12-bit, signal, preserve an accurate signal? Is such a technique considered lossy or lossless? For example, can a file that's 2/3rds the size of a 16-bit Flac still be as close to the fundamental limits of human hearing as mathematically-lossless 16-bit audio?

Another point is that the terms lossy and lossless are very subjective. Yes, Flac can preserve a 16-bit, or 24-bit signal without loss, but is decimating a 24-bit master to 16-bit lossy? Is 8-bit lossy? Is a lossy codec that results in a signal about as accurate as decimating to 8-bit lossy?

An area of frustration is the amount of software and APIs that I encounter that are locked at 16-bit. Decimating audio to 16-bit requires some perceptual conversions, and these conversions have consequences. This can become apparent when manipulating noise-shaped audio, and then re-noise-shaping it. Modern hardware does not need to be 16-bit; instead, we need to view 16-bit as a compression technique. Software developers should not have to deal with tradeoffs involved in decimating to 16-bit. Audio APIs really need to be 32-bit float; and the audio hardware will need to choose algorithms that convert this to analog at an accuracy unambiguously beyond the human range of hearing.

Thus, the final conclusion is that the term "lossless" (and "CD quality") must be carefully defined; otherwise, it will be abused as a marketing term. "Lossless" can't become like the terms "natural," "farm to fork," or "wholesome," where the retailer can redefine the description to match the product. The way to avoid confusion is to instead require terms like:

  • 16-bit, Mathematically lossless: Basically, Flac from a CD
  • Lossless at 120 dB: Completely lossless with an SNR of 120 dB. The signal might be slightly different than mathematically lossless, though.
  • Lossless at 96 dB: Completely lossless with an SNR of 96 dB. The signal might be slightly different than mathematically lossless, though.

Before I start, I would like to refer the reader to a document on mastering vinyl records at the following URL: http://www.urpressing.com/tips.html. I make comments about records that can be understood by reading the document.

I have a love-hate relationship with vinyl. Sometimes I marvel at how real the 50-year-old technology sounds compared to modern CDs and MP3s, and other times I curse the dust that pulls my attention away from the music. Ever since the early 80s, when CDs were a novelty for the rich, golden-eared audiophiles, nostalgic curmudgeons, and naive music fans alike have preached about the realistic nature of the vinyl record compared to digital audio. In this writing, I argue that it’s time to accept that, in terms of sonic quality, modern digital technology has surpassed all real and imagined advantages of the vinyl record, and I provide an argument as to why we should keep the antiquated practice of publishing treasured modern recordings on vinyl.

The truth is that a well-mastered vinyl record has the potential to sound better than a CD, due to the limitations of digital audio in the 80s. When used with a good needle, records have a slightly higher frequency range, about 25 kHz, than audio CD, which is locked at about 21 kHz, no matter how expensive the player is. (We can thank our good friend Mr. Nyquist for this.) While I do not know of any recordings that have notes anywhere near such high frequencies, overtones and harmonics in that range lend to the “realism” and “wow-ness” that we golden-ears are addicted to.

Remember how I just said that a well-mastered record has the potential to sound better than a CD? (This is where you should go skim the above-linked document if you don’t believe me.) Well, die-hard vinyl fans never seem to remember the technical limitations of this confused and misunderstood format. On many of my disks I’ve noticed two main annoying characteristics that leave me wishing I had spent the money on the CD. S’es tend to get distorted, and there is a serious lack of high end in the inner grooves. Unfortunately, no amount of studio trickery can fix this phenomenon, except for the possible re-ordering of program material. CDs may not sound as good as the outer part of a record, but at least they are consistent in their sound quality from beginning to end!

One of the biggest selling factors of CD, at least to the cotton-eared-masses, is that they always sound good. Records scratch easily, and unlike CD, where there’s extra information stored to help the CD player recover from minor scratches, you’re always going to hear it with a record, for the rest of its life. Don’t get me started on dust! I just can’t seem to get it off of many of my used records, but I can wipe it off of my CDs with my T-shirt.

Oh yeah, another reason why CDs are so popular is their portability. You can’t play records in the car! There were some feeble attempts in the 60s, but they only supported proprietary formats and 45s. Apparently they had to put so much pressure on the needle to keep it from falling out of the groove that it would wear the disk out prematurely. Personally, I’m very happy ripping my favorite CDs and keeping a bundle of high bit rate MP3-CDs in my car.

I won’t get into how the curved movement of the arm causes minute timing and pitch distortions; and causes the needle to not be at the perfect 0-degree angle to the groove. My turntable has a linear tone-arm. So there!

I’ll admit that I’ve heard many records that blow their CD versions away. Early Green Day on Lookout Records is one shining example. Whatever ADC they used to master their CDs simply did not pick up any bass. While theory states that digital audio should always have better bass, the records provide the full sound that their style calls for. This phenomenon is not unique and caused many metal fans to continue to use their phonographs for years after CD came along, and rightfully so. I wouldn’t want to listen to Black Sabbath or Metallica without bass; it’s a sin!

Likewise, I once listened to a well-produced pressing of Ride the Lightning by Metallica on record. The cymbals sounded so real and clear, unlike anything I had heard before on CD. With some records just beating the pants off of their CD counterparts, even I want to hold onto them until they are released in a hi-fi format such as DVD-Audio or SACD.

The mention of DVD-Audio and SACD brings me to my first main goal of this document, which is to declare records officially obsolete and sonically surpassed by modern technology. I’ve heard so many different naive audiophiles make the mistaken claim that “Digital will never surpass vinyl because it records everything between the samples; it’s a perfect analogue.” Don’t forget that vinyl is limited by the laws of physics and mechanics, just like everything else. The audiophiles are wrong in three ways, as listed below:

  1. Vinyl does a decent job at carrying two channels with proper mixing, but as the format war in the 1970s over quadraphonic audio on LP demonstrated, it doesn’t carry much more. Many people, including myself, find that music in surround is much more natural and real than traditional stereo. Digital, on the other hand, can discretely carry as many channels as possible. (I’ve heard all the arguments against surround-sound and will only offer one counter-argument. Listen to a good concert, and try to recreate the experience with traditional stereo. You can’t.)

  2. During a school project investigating ski-base wear, I learned that all material surfaces, no matter how smooth, are rough and random at some scale. This point is where vinyl, no matter how good of a manufacturing process is used, cannot hold a high frequency or soft note. I do not know if anyone has performed any research into determining where this point is on vinyl. How can vinyl record “everything between the samples” if even it has a limited resolution?

  3. The size and shape of the cutting lathe causes sounds to be clipped off, although they may conceivably be written onto a record. Even if additional sound “between the samples” makes it onto the record, it’s too small to be picked up by the needle and will never make it out of the speakers.

DVD-Audio and SACD have a sampling rate that is so high that it surpasses even the most sensitive of records and needles. Face it, a system that can reproduce a tone at 48 kHz without any special equipment will beat any system limited to 25 kHz in any double-blind test. As the demand for surround grows, I strongly doubt that records will be able to keep up, even if paired with modern matrixing techniques.

I think the final nail in the coffin for vinyl should be the mixing limitations that it imposes on artists. Telling an artist to pick certain pans, phases, and levels because “it sounds better on vinyl” is just limiting creativity. Nostalgia aside, let’s move on, folks!

Yet, I’m still drawn to vinyl. I occasionally opt to buy the record instead of the CD for new recordings. How can this be? I think the answer lies in the reason why vinyl is the only audio format that is called a record. (Pun intended.) What is the best format for us to preserve our audio records, for generations and generations to come? How are we going to ensure that archeologists thousands of years into the future will hear the music of the 20th and 21st centuries?

The answer is audio records. Not CDs, not tape, nor DVD-Audio, SACD, hard drives, or any other high-fidelity format that we can dream of. A record is the best medium for leaving a record (pun) of sound, for the following reasons. Many master analogue tapes are rendered unusable after 20 years. It’s tradition to bake tapes in a feeble attempt to get the sound off of them. Even if a tape does survive, it’ll probably demagnetize by the time archeologists get to it.

I wonder how long it’ll take a future archeologist to find and read the pits on a CD, let alone a DVD-Audio disk or SACD. It’ll probably take a few years to understand the coding system used on CDs, and even more to decrypt a DVD-Audio disk. Who knows how hard it’ll be to understand SACD.

The reason why vinyl is so important is that it’s easy to figure out how to play. A future archeologist will stick the disk under the microscope and see the wiggles in the grooves. He’ll realize that he’s looking at sound rather quickly. The archeologist might not play it at first with the correct equalization levels or at the correct speed, but at least he’ll hear something.

Thus, in conclusion, vinyl is dead for both consumers and audiophiles alike. Digital technology has finally surpassed records on all fronts, including audio quality. It’s important that we keep producing our most beloved recordings on record so that future archeologists will be able to easily play them back and understand 20th and 21st century music.


Addendum

This article stayed on my laptop for about a month after I wrote the first draft. During this time, I would occasionally think about how the LP survived as an active consumer audio reproduction mechanism for about fifty years. What strikes me the most about vinyl is that it is much more flexible than audio CD. Red Book CD is locked into 2-channel stereo at 16 bits per sample, 44,100 times per second. There have been some attempts to get around this, but they all have various levels of compatibility and do not work with existing CD players.

What options are there to extend the CD format? The ‘net has rumors about a quadraphonic CD demonstration in the 1980s, but the format had half the playing time and did not work in a typical CD player. There’s some additional data space that can be used during playback, which I believe is used by (some hifi CD format,) but it never caught on. CDs encoded with DTS are available, but they will not play in non-DTS equipment. One could create a player with large buffers that uses data beyond the playback tracks to enhance resolution, but such a player would be too complex for mass marketing and suffer from playback time reduction.

Vinyl, on the other hand, is quite flexible. It can be adapted to compete with modern digital formats, although with compromises. Need a better dynamic range? Leave more space between the grooves and record louder. Need a better frequency response near the center of the disk? Spin at 45 RPM. All of the techniques used to make records sound better will work with any existing player, but they shorten the playing time.
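The cost of spinning faster is easy to estimate. As a rough sketch, assuming the groove spacing (and therefore the number of revolutions on a side) stays the same, and taking a hypothetical 20-minute LP side as the example:

```python
from fractions import Fraction

LP_RPM = Fraction(100, 3)  # 33 1/3 RPM, the standard LP speed
FAST_RPM = Fraction(45)    # 45 RPM


def playing_time_minutes(minutes_at_lp_speed, rpm):
    """Playing time of the same groove when spun at `rpm`.

    Assumes the side holds a fixed number of revolutions, so the
    playing time scales inversely with rotational speed.
    """
    revolutions = Fraction(minutes_at_lp_speed) * LP_RPM
    return revolutions / rpm


# Hypothetical example: a side that runs 20 minutes at 33 1/3 RPM.
side_at_45 = playing_time_minutes(20, FAST_RPM)
print(f"{float(side_at_45):.1f} minutes")  # 14.8 minutes
```

Widening the groove spacing for more dynamic range cuts the time further, since fewer revolutions fit on the side.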

These “drawbacks” don’t seem to hurt a blossoming vinyl segment: DJs, both at raves and clubs. Shortened playback is not a problem for them, because they usually only buy a record for a 5-minute song and a few remixes. What’s even more interesting is that DJs aren’t drawn to vinyl because of audio quality; they pick the format because of the user interface!

For club DJs, where the goal is to have a continuous and seamless set of music for the length of the performance, “play,” “pause,” and “stop” do not give them enough control over their playback medium. DJs need to be able to start playback at precise instants so that the beats of two songs match perfectly during transitions. They also like to read a record’s grooves so that they can judge its loudness. Current digital technologies simply do not provide DJs with a good replacement for such an interface. Maybe they will at a later date, or maybe, in the future, club-goers will just wonder what those big, black, shiny things are.

DIY (Do it yourself) Speakers


Note: These pages are quite old, but I prefer to leave them as-is. Their style is quite similar to how the web looked in the early 2000s.

Andy Rondeau's Do-It-Yourself (DIY) Speaker Project

My speakers right after I completed them in 2003

I currently own a set of self-built Voigt pipes. There are four traditional pipes used for my front and surround speakers, and a modified pipe for my center channel. Each of the five speakers uses a single Radio Shack 8" 35-watt full-range driver (40-1044). What follows are two pages: one describing the construction and use of the five speakers, and the other describing my quest to magnetically shield the drivers so that they won't interfere with my television and cause data loss.

Late one night I was looking for reviews of DVD-Audio disks and came across an article which claimed that the goal of DVD-Audio is to fight MP3s. Here is my response. I didn't know that the site was going to publish it, and I wish that I had spent a few more minutes proofreading.

It appears that the original server is no longer online. You can view the article on the Internet Archive or in my own saved version.


Hi. I just came across an older article about DVD-audio:

http://www.mp3newswire.net/stories/2000/dvd.html

Robert Menta portrays DVD-Audio as a means for the record companies to squash MP3. Are you sure that he knows what he's writing about? Ever since the 1960s, there has been a clamor for multi-channel sound. (Do a little research into the history of quadraphonics.) Ever since the original audio CD came out, there has been a clamor for higher sampling rates and higher bit densities. DVD-Audio is primarily aimed at audiophiles. (They are the people who can hear the difference between an MP3 and the original CD.)

As far as a ripper for DVD-audio? That's a whole different can of worms, so to speak.

Dolby Digital and DTS are the lossy formats of choice for a lower-bandwidth version of DVD-Audio. (They are used to compress the soundtrack on a DVD movie. Both support 6-channel audio at 48 kHz/20-bit. DVD-Audio is typically 6-channel, 96 kHz/24-bit.) DTS fits into a CD; that is, 1 minute of DVD-Audio quality sound, compressed with DTS, will take up the same space as 1 minute of uncompressed CD sound. Dolby Digital takes up half the space of DTS, although many people notice a slight increase in quality with DTS.
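The claim that DTS "fits into a CD" implies a particular compression ratio. The sketch below takes the letter's figures at face value (6 channels at 96 kHz/24-bit for DVD-Audio) together with the standard CD format of 2 channels at 44.1 kHz/16-bit, and compares the raw, uncompressed data rates. The helper name `pcm_bitrate` is illustrative only, and the result says nothing about the bitrates that real DTS or Dolby Digital encoders use.

```python
def pcm_bitrate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Return the raw (uncompressed) PCM data rate in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


# Uncompressed CD audio: 2 channels, 16-bit, 44.1 kHz.
cd_rate = pcm_bitrate(2, 16, 44_100)

# Uncompressed DVD-Audio, using the letter's "typical" figures:
# 6 channels, 24-bit, 96 kHz.
dvd_audio_rate = pcm_bitrate(6, 24, 96_000)

# Compression needed to squeeze the DVD-Audio stream into the CD rate.
ratio = dvd_audio_rate / cd_rate

print(cd_rate)           # 1411200
print(dvd_audio_rate)    # 13824000
print(f"{ratio:.1f}:1")  # 9.8:1
```

Under those assumptions, a lossy codec would have to discard roughly nine tenths of the raw data to match the CD's rate, and about twice that again to take up half the space.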

Even better, in order to be compatible with current DVD-video players, many DVD-Audio discs come with the program already compressed into Dolby Digital! All you have to do is use your favorite DVD ripper to extract the audio information, and then save the soundtrack!

As far as the mp3 market? What use are they going to have for anything comparable to DVD-Audio? Most people listen to mp3s on their computers. Most people have awful computer speakers... The mp3 market simply will not care. Chances are, any DVD-Audio rips on Napster will be taken from the stereo version of the disk, reduced in resolution to 16-bit/48 kHz. That will be a slight increase in quality over current mp3s, which are 16-bit/44.1 kHz, and a slight increase in file size. Most people will never notice the difference, nor will they care.

Those who can afford to play back 6-channel DVD-Audio rips (because the speakers will be more expensive) will also be able to afford the expanded bandwidth needed for the larger files.

Andrew Rondeau