Something About Audio Coding Formats: A Weird Experiment Idea

WARNING: This post is not about linguistics, cognitive science, or computer science, but about acoustics and audio tech.

I’m a geek when it comes to audio, linguistically and otherwise. I love the gear, the science, and the experience. Most of my hands-on experience comes from recording music, listening to it on vinyl and iPods, and hoarding lossless albums. Naturally, I also consume a lot of content about audio science and tech.

Recently, I stumbled across a YouTube video titled “Early Experiments: Encoding Images with Sound” from the channel ScrollingMusic, and I was very intrigued, to say the least. In the video, sound was used to encode images, a concept that got me thinking about how we perceive audio quality, not just aurally but visually. This inspired me to try something I’d always been curious about: Could I see, visually, the difference between lossy and lossless audio formats?

One famous example of hidden images in audio spectrograms is from C418, the composer behind Minecraft’s music. He embedded subtle images, like the numbers “12418” and a “Steve” face, in the spectrogram of Disc 11. Seeing that made me wonder: How would these kinds of images look if encoded in lossy formats like MP3? Would the hidden message survive, or would compression distort it?

Audio Codecs: Lossy vs. Lossless

Before diving into my experiment, let’s go over the basics of audio codecs. Codecs are algorithms used to compress and decompress digital audio files, coming in two types: lossy and lossless.

Lossy Compression: This approach, used by formats like MP3 and AAC, removes parts of the audio data deemed “inaudible” to save space. While it can drastically reduce file size, it may also reduce sound quality, especially at low bitrates. A spectrogram of an MP3 file might show missing or distorted frequencies, often as a cutoff in the upper range, because the codec has selectively trimmed the data.
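Lossless Compression: FLAC and ALAC are examples of lossless codecs. (WAV is often grouped with them, but it is a container that typically stores uncompressed PCM rather than a codec that compresses anything.) These formats preserve all the audio data, so the decoded audio is bit-for-bit identical to the original source. Lossless files tend to be larger, but a spectrogram of one should show every frequency present in the source, with no data lost during compression.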
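
To make the distinction concrete, here is a minimal Python sketch. It does not use real audio codecs: zlib stands in for a lossless codec and crude bit-depth reduction stands in for a lossy one. Both are assumptions chosen only so the example runs with the standard library. MP3 and FLAC work very differently internally, but the round-trip behavior is the point.

```python
import math
import struct
import zlib

SAMPLE_RATE = 8000  # Hz, an arbitrary value for this illustration

# One tenth of a second of a 440 Hz tone as 16-bit PCM samples.
pcm = [int(20000 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)) for i in range(800)]
raw = struct.pack("<%dh" % len(pcm), *pcm)

# Lossless stand-in: zlib is NOT FLAC, but it shares the key property that
# decompression returns the input byte for byte.
restored = zlib.decompress(zlib.compress(raw))

# Lossy stand-in: clearing the low 8 bits of each sample is NOT how MP3 works
# (MP3 relies on a psychoacoustic model), but it also discards detail for good.
lossy = [(s >> 8) << 8 for s in pcm]

max_error = max(abs(a - b) for a, b in zip(pcm, lossy))
print("lossless round trip identical:", restored == raw)
print("lossy round trip identical:", lossy == pcm)
print("largest sample error after the lossy step:", max_error)
```

The lossless path gives back exactly what went in, while the lossy path leaves a small error on the samples that no later step can undo.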

The Experiment: Testing Audio Codecs Visually

My experiment is straightforward. I’ll encode a hidden message or simple visual pattern in various audio formats, aiming to reveal the differences between lossy and lossless compression. My plan is to create a short sound clip with a hidden image or phrase embedded in its spectrogram, then export it in formats like MP3 and FLAC. By comparing spectrograms side-by-side, I’ll see how much detail each format retains. Will the message remain clear in FLAC but look distorted in MP3? The spectrogram should provide visual evidence.
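
The first step of that plan, hiding a picture in a spectrogram, can be sketched roughly as follows. This is an illustration under stated assumptions, not the method from the video or from Disc 11: the 5x5 letter "A", the sample rate, and the row frequencies are all made up for the example, and every name in the code is hypothetical. Each image row becomes a sine tone at its own frequency, each image column becomes a short slice of time, and a single DFT bin per cell plays the role of one spectrogram pixel.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000      # Hz; kept low so the pure-Python loops stay fast
COLUMN_SECONDS = 0.05   # duration of one image column (assumed)
BASE_FREQ = 1000.0      # frequency of the bottom image row (assumed)
FREQ_STEP = 400.0       # spacing between neighboring rows (assumed)

# Hypothetical 5x5 bitmap of the letter "A"; row 0 is the top of the image.
PATTERN = [
    "01110",
    "10001",
    "11111",
    "10001",
    "10001",
]

def row_freq(row):
    """Map an image row to a tone frequency; higher rows get higher pitches."""
    return BASE_FREQ + FREQ_STEP * (len(PATTERN) - 1 - row)

def image_to_samples(pattern):
    """Render each image column as a short chord of sine tones."""
    n = int(SAMPLE_RATE * COLUMN_SECONDS)
    samples = []
    for col in range(len(pattern[0])):
        freqs = [row_freq(r) for r in range(len(pattern)) if pattern[r][col] == "1"]
        for i in range(n):
            t = i / SAMPLE_RATE
            value = sum(math.sin(2 * math.pi * f * t) for f in freqs)
            samples.append(value / len(pattern))  # keep the sum within [-1, 1]
    return samples

def tone_strength(block, freq):
    """Normalized magnitude of one DFT bin, i.e. one spectrogram pixel."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(block))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, s in enumerate(block))
    return math.hypot(re, im) / len(block)

def samples_to_image(samples, rows, cols, threshold=0.05):
    """Recover the bitmap by checking every (row, column) cell for its tone."""
    n = len(samples) // cols
    image = []
    for r in range(rows):
        line = ""
        for c in range(cols):
            block = samples[c * n:(c + 1) * n]
            line += "1" if tone_strength(block, row_freq(r)) > threshold else "0"
        image.append(line)
    return image

def write_wav(path, samples):
    """Save the samples as 16-bit mono PCM so an encoder can pick them up."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

Calling write_wav("hidden_a.wav", image_to_samples(PATTERN)) would produce a file that can be opened in a spectrogram viewer and then exported to MP3 and FLAC for the side-by-side comparison. Running samples_to_image on the decoded audio from each format would give a rough, automated version of the same visual check.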

I expect that the lossless formats (like FLAC) will preserve the message with little to no distortion, while the lossy ones may lose clarity or even scramble the message, depending on the compression level. This might offer a new perspective on how much detail we lose with lossy codecs—not just audibly, but visually.

Future Directions: Testing Acoustic Fidelity in the Lab

After conducting this initial experiment, I’m planning a follow-up to take it a step further. I have access to a noise-isolated laboratory at my workplace, which might allow me to conduct a more controlled test by directly encoding and decoding the audio through a wired setup. By bypassing environmental noise, I could examine the acoustic differences in even greater detail, comparing what we might visually and audibly detect with high-precision equipment.

If all goes to plan, I’ll share the results in a follow-up post. Stay tuned to see just how much your favorite MP3 files might be hiding—or rather, missing…

My midterms seem to have kept me busy for the past two weeks, but stay tuned!

12.11.24

Ali Çağan Kaya