What Is Spatial Audio
What is spatial audio? Apple is increasingly marketing spatial audio as a key feature in new products. So, let’s take a look at spatial audio and how it works.
“You’ve never heard music sound this good.” The hype felt familiar. A new tech product being described by a zealous tech fan. Not that Apple’s new headphones don’t sound good. But this kind of hype makes me wonder how many $500 headphones the tech bro has listened to.
Still, there’s something going on with this and other Apple products, something we’re going to hear a lot more about if the rumours about future products are true.
So, let’s take a moment to consider what Spatial Audio is, how it makes movies and music sound good, and where this tech might be taking the future of sound.
Understanding Spatial Audio
Back in early February 2020 I was traveling through Japan. Schools were already closed and precautions were increasing. With the risk of a pandemic looming, I started to think about how it might affect me. One concern was that my plan to rebuild my studio might be put on hold. And I was tired of not having a space where I could practise guitar.
So, while traveling in Japan, I picked up a pair of Boss Waza-Air headphones. You plug your guitar into a wireless transmitter and put on the headphones, and it sounds like you’re in a studio playing through a real guitar amplifier. Other products do similar things. But what makes Waza-Air special is the headphones have gyroscopes in them and use special audio processing, so as you move your head, your relationship to the amplified sound changes.
There are similar products that also sound good. But they always feel like you are listening to guitar music. The Waza-Air sounds like you’re in a room with an amp. You can put the amp behind you, as if you’re on stage, then when you turn around, the sound now faces you.
That’s spatial audio. It’s pretty magical.
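To make the head-tracking idea concrete, here is a minimal sketch in Python. It is not how the Waza-Air actually works (Boss doesn't publish that), and the function names are my own. It assumes angles measured in degrees clockwise from straight ahead, and it uses a simple constant-power stereo pan where real products use far more sophisticated filtering, which is why this toy version can't tell front from back.

```python
import math


def relative_azimuth(source_deg, head_yaw_deg):
    """Angle of the source relative to where the listener is facing.

    Both angles are in degrees, clockwise from the room's "front".
    The result is wrapped into the range (-180, 180].
    """
    angle = (source_deg - head_yaw_deg) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


def stereo_gains(azimuth_deg):
    """Constant-power pan: returns (left_gain, right_gain).

    Simplification: only the left/right component of the angle is used,
    so a source directly behind sounds the same as one directly ahead.
    """
    pan = math.sin(math.radians(azimuth_deg))  # -1 = hard left, +1 = hard right
    theta = (pan + 1.0) * math.pi / 4.0        # 0 .. pi/2
    return math.cos(theta), math.sin(theta)


# Hypothetical scenario: the virtual amp sits behind the player (180 degrees)
# and the player slowly turns around to face it.
amp_position = 180.0
for head_yaw in (0.0, 45.0, 90.0, 135.0, 180.0):
    azimuth = relative_azimuth(amp_position, head_yaw)
    left, right = stereo_gains(azimuth)
    print(f"head {head_yaw:5.1f}  amp at {azimuth:6.1f}  L={left:.2f} R={right:.2f}")
```

The gyroscope supplies the head angle many times a second, and the processing recomputes where the amp should appear. That constant updating is what makes the amp feel fixed in the room rather than glued to your head.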
The History Of Spatial Audio
Attempts to create a more three-dimensional sound go back over a hundred years. Binaural devices were developed initially to help industrialist Alfred DuPont hear more clearly in board meetings, and binaural was also used to detect the location of planes and submarines.
By the late 1930s, Bell Labs had developed a system which captured sound using a dummy human head that mimicked the way we hear. It could create recordings in which sounds appeared to move a full 360 degrees around a listener wearing headphones. They also developed a multitrack audio system that could replicate multidimensional sound in a cinema setting.
The Audio Engineering Society has a technical group for Spatial Audio, and their archive of papers tracking modern developments in spatial audio goes back to 1999.
The Technology Of Spatial Audio
Two key technologies shape how spatial audio is implemented today: impulse responses and multi-channel audio.
Music is typically available in two channels (called stereo). You listen with two speakers, or the two sides of a pair of headphones. You hear sound in front of you, or to either side of you. Some home and car systems have 2.1 channels, which means an extra speaker for low frequencies (a subwoofer), in addition to the regular pair.
Cinema systems evolved to have more channels, which allowed filmmakers to create the feeling of sound moving around you as you watched a movie. These extra channels gave sound designers the ability to place a sound not just to the left or right of a listener but also behind them, in front of them and even above them. First a centre channel was added, to make speech stand out more. Then this new 3.1 system evolved into 5.1, then 7.1, and now Dolby Atmos gives filmmakers up to 128 channels to work with. Today’s new cinemas typically have a 7.1.4 configuration, with seven speakers around the listener, four above, and one for low frequencies.
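If the dotted numbers look cryptic, a small sketch may help. It assumes the convention described above: the first number counts the speakers around the listener, the second counts subwoofers, and an optional third counts overhead speakers. The function name and dictionary keys are my own, for illustration only.

```python
def parse_layout(layout):
    """Split a layout label such as "5.1" or "7.1.4" into speaker counts."""
    parts = [int(p) for p in layout.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected layout label: {layout!r}")
    parts += [0] * (3 - len(parts))  # missing fields mean "none of those"
    around, subwoofers, overhead = parts
    return {
        "around": around,
        "subwoofers": subwoofers,
        "overhead": overhead,
        "total": around + subwoofers + overhead,
    }


for label in ("2.1", "3.1", "5.1", "7.1", "7.1.4"):
    print(label, parse_layout(label))
```

So a 7.1.4 room has twelve speakers in total: seven around you, one subwoofer, and four overhead.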
While multi-channel audio allows you to locate a sound in your listening space, impulse responses make it possible to change the sonic characteristics so it sounds like a different space entirely.
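A speaker and a microphone can map a room and the way sound behaves within it. Play a short, sharp test sound through the speaker, record what comes back, and you have the room’s impulse response. The echo of a cathedral or concert hall can be digitally reproduced. Or the dull, sound-absorbing qualities of a radio studio can be applied to a recording.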
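The maths behind applying a room to a recording is an operation called convolution: every sample of the dry recording triggers its own scaled copy of the room's response, and all those copies are added together. Here is a minimal sketch in plain Python. The tiny "room" is made up for illustration (the direct sound plus a single echo at half volume), and real software uses much faster FFT-based methods on responses that are thousands of samples long.

```python
def convolve(signal, impulse_response):
    """Apply an impulse response to a signal by direct convolution."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            # Each input sample launches a scaled copy of the room's response.
            out[i + j] += sample * tap
    return out


# A dry recording: one loud click, then a quieter one three samples later.
dry = [1.0, 0.0, 0.0, 0.5]

# A made-up "room": the direct sound, then one echo two samples later
# at half the level.
room = [1.0, 0.0, 0.5]

wet = convolve(dry, room)
print(wet)  # [1.0, 0.0, 0.5, 0.5, 0.0, 0.25]
```

Each click in the output is followed by its own echo. Swap in an impulse response measured in a cathedral and the same few lines would put the recording in the cathedral.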
You can even sample the sound of a room you’re in. By inverting that sound and playing it back through a pair of headphones, you cancel the background noise.
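The inversion trick is easy to show with numbers. The sketch below is an idealised toy, not how any particular headphone does it: it assumes the noise is known exactly and that the inverted copy arrives perfectly on time. Real noise cancellation has to measure the noise with microphones and keep adapting. The sample rate, frequencies and names are arbitrary choices of mine.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary for this sketch)


def tone(freq_hz, n_samples, amplitude=1.0):
    """Generate a sine wave as a list of samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def invert(samples):
    """Flip the waveform upside down."""
    return [-s for s in samples]


def mix(a, b):
    """Add two signals sample by sample, as the air (or your ear) does."""
    return [x + y for x, y in zip(a, b)]


noise = tone(100, 80, amplitude=0.5)  # a low background hum
music = tone(440, 80)                 # what you want to hear

anti_noise = invert(noise)
residual = mix(noise, anti_noise)     # hum plus anti-hum
heard = mix(music, residual)          # what reaches the ear

# If the anti-noise arrives even one sample late, cancellation is incomplete.
late_anti_noise = [0.0] + anti_noise[:-1]
late_residual = mix(noise, late_anti_noise)

print("perfect timing, loudest leftover:", max(abs(r) for r in residual))
print("one sample late, loudest leftover:", max(abs(r) for r in late_residual))
```

With perfect timing the hum disappears completely and only the music is left. The late version shows why this is hard in practice: the smallest misalignment lets some of the noise back in.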
The Appeal Of Spatial Audio
Think about sound in a video game. Your game’s character is walking outside on gravel. They open a door. Then walk into a room with hard concrete floors. The sound of the footsteps needs to change – not just because our steps sound different on gravel or concrete, but also because sounds outdoors behave very differently to sounds indoors.
The quality of audio is central to the experience of visual entertainment like games or movies.
Why do the Academy Awards for sound design and editing so often go to films about conflict and war? Getting the sound right for fast-moving objects, and people moving amongst them, is tricky.
Any sound out of place pulls you from the reality of the scene. Sounds need to move logically, even as the camera pans or as characters move around.
There’s a scene in Alfonso Cuarón’s Roma where some of the characters are on the rooftop of their home in Mexico City, washing clothes. The camera does a full 360-degree pan to show other people in the neighborhood on their rooftops doing the same thing. As it does, the sound follows the camera movement. You feel as if you are in the scene, on that rooftop, looking at those people as they go about their chores.
In a cinema with a surround-sound setup, it’s a magical moment. But without the sound behaving in such a three-dimensional way, it wouldn’t be as powerful an experience.
The Challenge For Spatial Audio
Music in multichannel formats has never really taken off. There were attempts a few years back to sell CDs in surround sound. Years before that, there was quadraphonic audio, which promised a surround-like experience using four speakers (the hi-fi in my childhood home was a Pioneer Quadraphonic system).
This highlights a quirk in consumer behavior – people’s unwillingness to pay more for better sound. Consumers are prepared to upgrade their screens on a regular basis, 1080p to 4K, LCD to LED to OLED. But they’re far less likely to upgrade their sound. Few consumers are willing to spend the cost of a TV on a pair of headphones, even though the sound quality will be stunning.
Perhaps this shouldn’t surprise us. The first CDs sounded worse than vinyl. Eventually, CD quality improved. Then MP3s came along and CDs sounded worse again.
This is the challenge for companies making audio products. Consumers seem unwilling to pay for significantly better sound. But good audio quality makes music listening so much richer and is integral to the experience of visual entertainment like games or movies.
The Possibility Of Augmented Audio
There’s been a lot of buzz around augmented reality. You hold up a smartphone camera (or look at the world through smart glasses) and digital features are superimposed over the image of the reality around you. Pokémon Go was an example of this. A few years ago, an app was released that allowed you to find the train station exit you wanted just by holding up your phone and following the arrows – useful for making your way out of somewhere like Shinjuku Station, with over 200 exits, 35 platforms, and more than 3.5 million passengers a day!
A similar thing can be done with audio.
Imagine yourself trying to listen to a conversation in a noisy environment. Maybe a loud restaurant, a busy conference, or even a crowded train station. What if your headphones could listen and separate out the sound of the people close to you and filter out the noise in the background?
This feels as urgent now as it did for DuPont in his boardroom nearly a hundred years ago.
The tech that makes spatial audio possible can also make this kind of augmented audio possible. And while $500 for a pair of headphones might seem expensive, it’s nothing compared to the cost of making a physical space sound good.
The Cost Of Good Audio Environments
I spent nearly as much as the cost of a small car on sound treatment for my studio in Tokyo. This didn’t make the room “soundproof” as most people think of it. But it did make it sound very good. Quiet, when nothing was playing. Super detailed, when the music started. And that wasn’t a lot of money for sound treatment. It was the basics.
But this is the thing about good sound: you start with the room. If we built offices and classrooms properly, then we’d start with making them acoustically optimal for the most important sound in them – the human voice.
But it would make building more expensive.
Many of the spaces we live in, work in, and travel through are far from optimal. They are noisy. They make for poor spaces to listen to music in, watch films, or sometimes even have conversations.
So it’s not hard to see why high-end headphones and spatial audio could be popular.
I would guess that’s what Apple are betting on. Given how important good sound is. And how joyous and liberating music can be.