Pixion’s Audio Output
Hello Pixioneers!
Today, let’s dive into the fascinating realm of sound: a world full of vibrations, codecs, and frequencies! I’ll be honest with you: I have almost no knowledge about “audio stuff.” My experience? Extracting a few .wav files from CDs and .ogg files from game folders, converting them into .mp3 via a free online tool while clicking random settings without really knowing what I was doing.
I’ve always used the built-in speakers on my TV or monitor. I’ve never owned any kind of fancy, alien, hi-tech, multi-channel amplifier box. In short: I’m starting from a long way back when it comes to audio expertise. That’s also why, back in my earlier architecture post, I wasn’t sure at all what direction I wanted to take on this front.
So… let’s embark on this auditory journey together.
1. File Format
Before anything plays, we need to decide what we’re playing. That means choosing a file format.
From what I’ve read, audio files generally fall into three main categories:
Type | Quality Impact | Size Impact | Examples |
---|---|---|---|
Uncompressed | Highest quality (if recorded properly) | Heavy | .wav, .aiff, .pcm, .raw |
Lossless compression | No quality loss | Lighter | .flac, .ape, .wv, .m4a (can be lossless or lossy, depending on codec) |
Lossy compression | Quality reduced (though often imperceptibly) | Lightest | .mp3, .aac, .ogg, .opus |
Nowadays, flash memory is cheap and SD cards with lots of space are everywhere. So I’m not too concerned about file size. For simplicity, I’ve decided to go with .wav files. They’re easy to read, widely supported, and don’t require any decompression algorithm. That’s perfect for the microcontroller.
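To put a rough number on “not too concerned about file size”, here is a small Python sketch that computes how much space uncompressed PCM audio takes. The CD-style defaults (44.1 kHz, 16-bit, stereo) and the function names are assumptions picked for illustration, not settings Pixion is committed to.

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * channels * (bits_per_sample // 8)


def pcm_size_bytes(duration_s: float,
                   sample_rate_hz: int = 44_100,   # assumed CD-style sample rate
                   channels: int = 2,              # assumed stereo
                   bits_per_sample: int = 16) -> int:
    """Size of the raw sample data only (the small file header is ignored)."""
    return int(duration_s * pcm_bytes_per_second(sample_rate_hz, channels, bits_per_sample))


one_minute = pcm_size_bytes(60)
print(f"{one_minute} bytes, about {one_minute / 1e6:.1f} MB per minute")
```

At roughly 10 MB per minute, even an hour of audio stays well under a gigabyte, which is small next to a typical SD card.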
Of course, if I ever want to tinker with decoders later (and I probably will ^^), I’ll still be able to explore formats like .mp3 or .ogg. But to get things running, .wav is a good first step.
Just a quick technical note: in its simplest form, a .wav file begins with a 44-byte header that contains info like the format, sample rate, and number of channels. After that comes the raw sound data itself. (Files that carry extra metadata chunks can have a longer header, so it’s worth checking rather than assuming.) We’ll get into the details of how that works when we look at the embedded software side of things.
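As a preview, here is a hedged Python sketch of what reading that header could look like. It assumes the simplest “canonical” PCM layout (a RIFF/WAVE preamble, a 16-byte fmt chunk, then the data chunk) and rejects anything else. The names `WavInfo`, `parse_canonical_wav_header`, and `build_canonical_wav_header` are made up for this example, and the builder only exists so the sketch can run without a real file.

```python
import struct
from typing import NamedTuple

# Little-endian layout of the canonical 44-byte header:
# "RIFF", riff size, "WAVE", "fmt ", fmt size, audio format, channels,
# sample rate, byte rate, block align, bits per sample, "data", data size
WAV_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")


class WavInfo(NamedTuple):
    channels: int
    sample_rate: int
    bits_per_sample: int
    data_size: int


def parse_canonical_wav_header(header: bytes) -> WavInfo:
    """Parse a canonical PCM header; raise ValueError for any other layout."""
    if len(header) < WAV_HEADER.size:
        raise ValueError("header is shorter than 44 bytes")
    (riff, _riff_size, wave, fmt_id, fmt_size, audio_format, channels,
     sample_rate, _byte_rate, _block_align, bits, data_id, data_size) = \
        WAV_HEADER.unpack_from(header)
    if (riff, wave, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte WAV header")
    if fmt_size != 16 or audio_format != 1:
        raise ValueError("only plain PCM is handled in this sketch")
    return WavInfo(channels, sample_rate, bits, data_size)


def build_canonical_wav_header(channels: int, sample_rate: int,
                               bits: int, data_size: int) -> bytes:
    """Build a header for testing, so no real file is needed."""
    block_align = channels * bits // 8
    byte_rate = sample_rate * block_align
    return WAV_HEADER.pack(b"RIFF", 36 + data_size, b"WAVE", b"fmt ", 16, 1,
                           channels, sample_rate, byte_rate, block_align,
                           bits, b"data", data_size)


header = build_canonical_wav_header(2, 44_100, 16, 1_000)
print(len(header), parse_canonical_wav_header(header))
```

A more robust reader would walk the chunks one by one instead of trusting fixed offsets, but for files exported as plain PCM this fixed layout is a reasonable starting point.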
2. Audio Generation
Now that we know what we want to play, let’s shift our focus to how we’re going to make that sound a reality — and not just data sitting on an SD card.
Each sample in a .wav file is a digital representation of an audio waveform’s amplitude at a given point in time. So, in theory, we just need to convert these values into voltages, send them to a speaker, and voilà … sound!
Right? … Right?
Well… not quite. As usual, reality is a bit more complicated. Let’s keep going on our exploratory audio quest.
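For reference, the naive “sample in, voltage out” idea can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: signed 16-bit samples, a hypothetical 12-bit DAC, and a 3.3 V reference. It only covers the number-to-voltage mapping, which is exactly the part that looks easy on paper.

```python
def sample_to_dac_code(sample: int, dac_bits: int = 12) -> int:
    """Map a signed 16-bit PCM sample (-32768..32767) to an unsigned DAC code."""
    if not -32768 <= sample <= 32767:
        raise ValueError("sample is outside the signed 16-bit range")
    unsigned = sample + 32768            # shift to 0..65535, silence at mid-scale
    return unsigned >> (16 - dac_bits)   # keep only the top dac_bits bits


def dac_code_to_volts(code: int, dac_bits: int = 12, v_ref: float = 3.3) -> float:
    """Ideal output voltage of a DAC for a given code (assumed linear, 0..v_ref)."""
    return v_ref * code / ((1 << dac_bits) - 1)


for s in (-32768, 0, 32767):
    code = sample_to_dac_code(s)
    print(f"sample {s:6d} -> code {code:4d} -> {dac_code_to_volts(code):.3f} V")
```

Silence (sample 0) lands at mid-scale, around 1.65 V, so the output carries a DC offset on top of a signal that is still far too weak for a speaker.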
From Digital to Audible
We’ve seen how the audio data is stored and how we can get it into the microcontroller’s memory. But digital data can’t move air on its own. To be heard, it needs to be transformed into physical vibrations, in other words, a sound wave.
Traditionally, this meant using a Digital-to-Analog Converter (DAC) to generate an analog signal, and sending that out through connectors like 3.5 mm jacks, RCA, XLR, or DIN. The audio device on the other end (a speaker, headphones, etc.) would do the rest.
But today, many modern audio devices (TVs, speakers, wireless earbuds, etc.) already include their own DACs. When that’s the case, it can be more efficient to send the audio as a digital signal, which avoids picking up analog noise along the way and preserves quality over long cables. Common connectors for digital audio include USB, Toslink (optical), and sometimes even RCA (coaxial S/PDIF) or XLR (AES3).
And of course, we can also send sound wirelessly via Bluetooth, Wi-Fi, and other radio-based protocols.
For now, I’m keeping things simple. Pixion will start with analog output via a 3.5 mm stereo jack. I don’t have exotic gear, and I’m not aiming for audiophile-grade performance; I just want to make sound happen cleanly. That said, I’ll likely leave room for an extension board, so I can add digital or wireless output later on, if the need or curiosity arises.
Audio Amplification
The output of a DAC, especially one built into a microcontroller, isn’t strong enough to drive a speaker or even most headphones directly. We need to amplify it.
As I started learning about amplifiers, I discovered that there are multiple types, categorized into different “classes”, each with its own trade-offs in terms of efficiency, quality, and complexity. Here’s a simplified overview of what I found:
Class | Operation | Efficiency | Audio Quality | Strengths | Weaknesses |
---|---|---|---|---|---|
A | Always conducting | ~25–30% | Excellent | Ultra low distortion | Hot, inefficient |
B | Push-pull halves | ~50–60% | Moderate | Simple | Crossover distortion |
AB | Overlapping push-pull | ~50–70% | Very good | Good balance | Still gets warm |
D | PWM switching | >90% | Good to very good | Efficient, compact | Needs filtering |
G | Multiple power rails | ~60–75% | Very good | Improved efficiency | Complex design |
H | Variable power supply | ~70–80% | Very good | Dynamic power scaling | Complex design |
I’m sure there are more out there (such as class-T, etc.), but these are the most common I came across.
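To make the efficiency column a bit more tangible, here is a small Python sketch that turns an efficiency figure into heat. The 1 W output level and the single representative efficiency per class are assumptions picked from the ranges in the table, not measurements.

```python
def wasted_power_w(output_power_w: float, efficiency: float) -> float:
    """Power turned into heat for a given audio output power and efficiency (0..1]."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    input_power_w = output_power_w / efficiency   # what the supply has to deliver
    return input_power_w - output_power_w


# Representative values taken from the ranges above (assumed, for illustration only)
representative_efficiency = {"A": 0.25, "AB": 0.60, "D": 0.90}

for amp_class, eff in representative_efficiency.items():
    heat = wasted_power_w(1.0, eff)
    print(f"Class {amp_class}: {heat:.2f} W of heat for 1 W of audio")
```

With these figures, a Class A stage burns about 3 W to deliver 1 W, while a Class D stage wastes closer to a tenth of a watt, which matters a lot on a small, battery-friendly board.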
At first, I thought about designing my own amplifier. It sounded fun on paper but let’s be honest: this project is already ambitious, and audio isn’t the part that excites me the most. So I’ve decided to use an off-the-shelf amplifier for now.
That said, I still want the option to experiment with custom amps in the future. To keep things flexible, I’ll add a connector to the board so I can plug in my own amplifier module later on.
Matching Power and Impedance
Choosing an amplifier is only half the story: we also need to know what kind of load it’s going to drive. In audio, that mostly means knowing its impedance.
From what I’ve found, headphone impedance can range from 8 to 600 ohms, though most commonly it’s 16 or 32 ohms. Loudspeakers usually fall between 4 and 8 ohms, sometimes up to 16 ohms for specific designs.
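Why the impedance matters: for the same output voltage, a lower impedance draws more current and therefore more power from the amplifier. The Python sketch below illustrates this with P = V² / R, treating the load as a pure resistance (a simplification, since real speakers and headphones vary with frequency). The 1 V RMS level is an arbitrary assumption.

```python
def load_power_w(v_rms: float, impedance_ohm: float) -> float:
    """Average power delivered into a resistive load: P = V^2 / R."""
    if impedance_ohm <= 0:
        raise ValueError("impedance must be positive")
    return v_rms ** 2 / impedance_ohm


def load_current_a(v_rms: float, impedance_ohm: float) -> float:
    """RMS current drawn by a resistive load: I = V / R."""
    if impedance_ohm <= 0:
        raise ValueError("impedance must be positive")
    return v_rms / impedance_ohm


for name, ohms in [("32 ohm headphones", 32), ("8 ohm speaker", 8), ("4 ohm speaker", 4)]:
    p_mw = load_power_w(1.0, ohms) * 1000
    i_ma = load_current_a(1.0, ohms) * 1000
    print(f"{name}: {p_mw:.1f} mW, {i_ma:.1f} mA at 1 V RMS")
```

Going from 32 ohms to 4 ohms multiplies the current and power by eight at the same voltage, which is why an output that is comfortable with headphones can struggle with a loudspeaker.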
The Chosen One
So what amplifier did I choose?
After all that research, I’ve selected the MAX98357A. It’s a compact Class D amplifier that outputs up to roughly 3 W into a 4 ohm speaker (and less into 8 ohms), with an efficiency of around 92%. It only offers mono output, so if we ever want stereo, we’ll need to use two of them.
It’s compatible with 3.3 V logic, which makes it a great match for modern microcontrollers. Even better, it takes digital audio in via I²S, a serial bus designed specifically for carrying digital audio between chips. That means I won’t need a separate DAC, and because the signal stays digital until it reaches the amplifier, it’s less prone to picking up noise on the PCB.
Also… I just wanted to experiment with I²S. 🙂
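As a first taste of what the microcontroller will have to generate, here is a hedged Python sketch of the two clocks an I²S link needs: the word-select clock (often called LRCLK), which toggles once per sample frame, and the bit clock (BCLK), which ticks once per transmitted bit. It assumes standard two-slot (left/right) I²S framing; the function name and the example settings are illustrative, not Pixion’s final configuration.

```python
def i2s_clocks(sample_rate_hz: int, bits_per_slot: int) -> tuple[int, int]:
    """Return (lrclk_hz, bclk_hz) for standard two-slot I2S framing."""
    if sample_rate_hz <= 0 or bits_per_slot <= 0:
        raise ValueError("sample rate and slot width must be positive")
    lrclk_hz = sample_rate_hz                     # one left/right frame per sample
    bclk_hz = sample_rate_hz * 2 * bits_per_slot  # two slots per frame
    return lrclk_hz, bclk_hz


for rate, bits in [(44_100, 16), (48_000, 32)]:
    lrclk, bclk = i2s_clocks(rate, bits)
    print(f"{rate} Hz, {bits}-bit slots: LRCLK = {lrclk} Hz, BCLK = {bclk / 1e6:.4f} MHz")
```

Even a mono amplifier sees both slots on the bus, so the bit clock for 16-bit audio at 44.1 kHz is about 1.41 MHz regardless of how many speakers are attached.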
Wrapping Up
Phew. That was a lot. I’ll admit this article is a bit less structured than usual. The audio world is a deep, winding place, and I’m still finding my bearings. But we’ve covered the essentials to go from .wav file to real-world sound, and that’s a big step forward.
In the next post, we’ll tackle the controller so we can finally interact with our little Pixion. Until then, keep exploring.
Pix’