A Game That Listens — The Audio of INSIDE
Speakers: Martin Stig Andersen (Independent) [http://www.martinstigandersen.dk/]
“I now really enjoy dying.”
Martin Stig Andersen is a human being who breathes air, eats food and drinks water, just like the rest of us–but unlike you and me, he was responsible for the audio of LIMBO.
Martin was in tow at this year’s Game Developers Conference to give an advance look at some of the sound design and systems behind Playdead’s upcoming title, INSIDE, as well as spend some time at the Audiokinetic booth on the show floor showing off all the knobs and wires.
And look, I don’t want to oversell his talents too much, because there’s a nonzero chance that he’ll read this at some point and it could be really weird when I run into him down the line. It already is, to an extent. For a lot of people. After a few years of realizing that the people behind your favorite game experiences are just people, you cool it with the crazy sycophantic fandom stuff. Mostly. But Martin’s got a really, really sharp mind for sound design and interactive audio, and I wouldn’t be the first person to tell him so.
I kinda feel bad for him, actually; there was, even in the rushed fifteen-minutes-on-a-Friday time I hung around the booth, this revolving Who’s Who cast of sound designers that’d never hire me, asking questions and pointing at the screen and talking coolly to Martin about their thoughts. Satisfied, they would end this exchange in this self-aware explosion of hurried fandom on how incredible his work is and how important LIMBO was to them and gosh, I dunno, you’re just a genius, and thank you so much for doing everything you do, Martin, before shaking his hand and dashing off.
Maybe there’s another article in me sometime about how damn weird it must be to be on that other side, or on the etiquette of admitting prior knowledge of a person and how much you love them when you’re finally face to face on the conference floor. I do think about it a bunch, and usually just tell people to chill.
For now I should get to brass tacks on what Martin and Playdead are up to and how you might learn from it.
A Brief History
First off, get familiar: if you haven’t played LIMBO, what the fuck are you doing here?
You can either play it now, or after this article makes you realize you should, but it’d probably inform the read a little bit.
Martin’s work on LIMBO set a lot of firsts for game audio, or at least, went public with them in a way that let their influence really start to bear on this generation of sound designers. On the design front, LIMBO’s soundscape is a perfect match for the stark, blurred-out monochrome of the game’s visuals. Sounds move from muffled, wire recorder-processed bits of texture to full-frequency moments of danger that leap to the fore. The similarly blurred line between what’s sound / what’s music is trodden almost invisibly through the game thanks to Martin’s nearly Musique Concrète approach of tuning everything to fit together. There’s a talk on this from GDC 2011–watch it.
Beyond its aesthetic accomplishments, LIMBO was the poster child of How You Should Do Things In Wwise for many years. The game released in 2010; the program Martin used to stitch all this logic together, I bet most of us would barely recognize. But he sure as heck accomplished a lot with it. Credit where it’s due, Playdead seems to have given him a really wonderful suite of debug tools with which to draw bounds and place events for determining mix states throughout the game’s stages. This mix, which glues it all together, is consistently auto-attenuating, resetting, evolving as the player adventures onwards. Those tools were shown in Martin’s 2012 lecture on implementation–again, watch the talks I link at the bottom.
So yeah man, this game was a big deal.
Shortly after completing LIMBO, Playdead scooped up audio programmer Jakob Schmid in a really smart first-round draft. Playdead’s lead gameplay designer, Jeppe Carlsen, set off to explore the Unity game engine and try his hand at a different sort of platformer, bringing along Schmid, as well as motion graphics designers Niels Fyrst Lausdahl and Andreas Arnild Peitersen. They’d create what eventually became 140, a visually mesmerizing platformer in which each level’s elements (blocks, elevators, etc.) were choreographed by the game’s soundtrack, knitting it all together in a really cohesive and trippy package. See for yourself!
Years later, with these audio engine-directed learnings in place, INSIDE’s in full production, and we get to where we are now. Headline! Inside INSIDE: Playdead Comes Alive.
Get It Right The Second Time
Nothing’s ever perfect in the mind of its creator. Martin revealed that one of his biggest regrets with LIMBO was the way the death / respawn interrupted the flow of the game’s soundscape. If you weren’t particularly bothered by this, it’s probably because this is the way we expect games to work. When we fade to black, things go quiet; when we fade up, all is new again. Games flush cache and reload variables across moments like this, and lots about what the audio engine knows or is doing goes along for the ride. It’s bog standard.
Thing is, in LIMBO, you die a lot. And Martin had very specific goals for its sound experience–an ever-evolving blend of literal and acousmatic sound effects, textures and pads–that clash with the audio engine’s impersistence. Imagine the Boy’s deaths and the player’s experience of the world as one, unbroken timeline, and you can start to see Martin’s perspective: you intend for some sound design, or a particular mix state to play just at the player’s first encounter with an area, and these serve as passage into some new part of the overall composition. You’d rather not back out of it until the player’s advanced, and every death before the next checkpoint is part of that same section of the piece. Retriggering these cues “because Memory Management” goes against artistic intent and makes things feel much more “game”y than they need to.
This time around, they set the sound engine to start at the beginning of the game, and never, ever unload until the game has finished. Everything the audio engine knows, and everything the game knows about it, rolls all the way through, untouched until the game is shut off. This means voices don’t need to cut off, playback position can be remembered, and mix states and other game object-specific variables can be adjusted with the long view in mind.
We do something similar at PopCap with our music system, firing off one event at sound engine initialization and letting it roll until the session’s done or interrupted. The game’s music evolves through State changes over time. Mostly, we do this because guaranteeing the music’s on lets us steer it mid-flight with Events we’ve created ourselves, instead of relying on developers to trigger those cues. Also, cycling things in and out of memory costs CPU, which is itself a precious commodity on mobile.
So this idea of keeping the audio engine alive forever is not new–nor are a lot of the bullet points that make up the technical considerations of INSIDE.
Yet, there’s something magical in the way they’re all brought together.
How This Sounded
To illustrate the shortcomings he saw in LIMBO, Martin presented a couple of gameplay clips of the boy dying and respawning with their original audio in place–and then a few clips of the same sequence with their audio replaced to allow the music to play through.
The Befores and Afters here are subtle. Music cues which fired off just as a player first encountered a challenge (hopping across the electrified HOTEL sign) happened just once, the first time, and the rest of the piece continued through every fade to black as the boy died a handful of times in trying to cross this section. It lent a really nice continuity to the segment that wasn’t there when everything had to restart on each reload.
A practical example of this on INSIDE takes place in one of the game’s many puzzles. Martin demonstrated a section in which the Boy’s got to flip a breaker switch that turns on a few dangerous, rotating sentry lights that will kill him if crossed, but whose light he nonetheless needs to proceed.
Those lights sounded dangerous as all get-out; really great, full-spectrum stuff. Activating them drives a railroad spike through what’s been a cautiously filtered mix up until this point. They seize and hold your attention, and they are meant to: Don’t fuck with these lights.
But the focus shifts thereafter to the Boy and how you’ll help him through this. So, in a persistent mix decision that leverages this always-on audio engine, the game begins to attenuate those lights on subsequent deaths and reloads. Even though the game objects associated with those sounds are trashed in the interim, those variables are kept alive in the sound engine, to which the game looks for its cues.
Martin quipped that these were “such small changes that the player [was] unlikely to notice them,” but that’s a very knowing statement.
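For the programmers in the room, here is a minimal sketch of that idea in Python. It is not Playdead's code and not a Wwise API: `PersistentMix`, `SentryLight`, the 3 dB step and the -12 dB floor are all names and numbers I made up for illustration. The only point is that the attenuation lives in an object that is never unloaded, so a freshly rebuilt game object can ask it how loud to be.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentMix:
    """Mix state owned by the audio layer, so it survives level reloads."""
    step_db: float = 3.0      # hypothetical attenuation added per death
    floor_db: float = -12.0   # hypothetical lower bound on the offset
    deaths: dict = field(default_factory=dict)

    def on_player_death(self, area: str) -> None:
        self.deaths[area] = self.deaths.get(area, 0) + 1

    def emitter_gain_db(self, area: str) -> float:
        """Gain offset in dB for a threat emitter in `area`."""
        count = self.deaths.get(area, 0)
        return max(self.floor_db, 0.0 - self.step_db * count)


class SentryLight:
    """Game object that is destroyed and rebuilt on every respawn."""

    def __init__(self, area: str, mix: PersistentMix):
        # The new object has no memory of its own; it asks the mix.
        self.gain_db = mix.emitter_gain_db(area)


mix = PersistentMix()                       # created once, never unloaded
first = SentryLight("breaker_room", mix)    # first encounter: full level
mix.on_player_death("breaker_room")
second = SentryLight("breaker_room", mix)   # rebuilt after the reload
print(first.gain_db, second.gain_db)
```

The second light comes back quieter even though nothing about the first one survived the reload, which is the behaviour described above.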
We saw this same phenomenon in LIMBO, with an incredibly layered attenuation system that slowly swung the volume of the boy’s footsteps up and down within a 15dB window, irrespective of camera distance and entirely for this effect.
What happened there, and what’ll happen with INSIDE, is that all of these subtle changes sum up to larger changes you feel, even if you can’t quite put your finger on them. This is the mix as conversation with the player rather than some unweighted diegetic reality, broadcast out.
Again, there’s no one showstopper here; it’s the concert of everything working together. You start to see what five years’ tuning will get you.
Listening to What?
That all sounds well and good, but what about the name of the talk–what exactly is the game listening to?
INSIDE’s been structured so that many of its events and logic are fired and evaluated totally within the context of the audio engine, which has some really rad creative consequences I’ll think out loud about in a bit. For starters, this top-down audio-driven vision required that “game time”–which is usually measured in frames, and can swing variably up and down depending on system performance–instead be locked to Wwise’s metronome.
The BPM of INSIDE’s soundtrack (and I use the term loosely, as none of it seems so obviously musical) is the actual beating heart of the game. You can easily envision how this would work in something like Rock Band, or 140, but INSIDE is very different from those games. It’s bleak, desaturated and looks very much like it rose from the bones of LIMBO. That’s another way of saying, it’s not flashing note charts and neon elevators in your face–but the all-powerful thrum of its soundscape is just as much in control.
Here’s an example.
In one scene, the Boy is in a factory, sneaking between the rank and file of a bunch of deactivated automata. There’s a conspicuous space between two of them, and as you step into it, a track-mounted security drone snaps to life from just above, latching you with a spotlight. The steady pulse of the factory swells into a rhythm; the line of machines begins to march. It’s The Sorcerer’s Apprentice meets Pink Floyd’s “The Wall,” and the player’s got to keep lockstep with the beat to avoid being snicked by the drone.
Four bars of a march, four bars of silence. The drones stop, you stop. The drones move, you move. Just like that.
There are no scrolling meters or anything to let you know that this is how this section needs to be played–by now, the game has taught you to listen.
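A rough sketch of what "game time locked to the metronome" could look like, in Python. Everything specific here is an assumption of mine rather than something shown in the talk: the names `MusicClock` and `march_is_active`, the 120 BPM tempo, 4/4 time, the 48 kHz sample rate, and the idea that the loop opens with the march rather than the silence. The point is only that the gameplay question ("are the drones moving right now?") is answered from the audio playback position, not from a frame counter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicClock:
    """Converts audio playback position into musical time."""
    bpm: float
    beats_per_bar: int = 4
    sample_rate: int = 48_000

    def bar_position(self, samples_played: int) -> float:
        """Bars elapsed since the cue started, derived from audio samples."""
        seconds = samples_played / self.sample_rate
        return seconds * self.bpm / 60.0 / self.beats_per_bar


def march_is_active(clock: MusicClock, samples_played: int,
                    march_bars: int = 4, rest_bars: int = 4) -> bool:
    """True while the loop is inside its marching bars.

    Assumes the loop starts with the march and then rests.
    """
    cycle = march_bars + rest_bars
    return clock.bar_position(samples_played) % cycle < march_bars


clock = MusicClock(bpm=120)
one_bar = 2 * clock.sample_rate   # at 120 BPM in 4/4, a bar lasts 2 seconds
for bars in (0, 3, 4, 7, 8):
    print(bars, march_is_active(clock, bars * one_bar))
```

Because the answer depends only on how many samples the audio device has played, a slow frame changes how often the question gets asked, not what the answer is.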
(Actually, I’m wondering how the hearing-impaired are ever supposed to play this game. Or even those without some extended low-end going on in their living room, because GODDAMN this game is about the well-crafted sub.)
Here, as everywhere, the soundscape rolls straight through your horrifyingly understated death. If you were to start tapping your foot to the march of this level, you could keep it up until you finished. Probably beyond, actually. I think Martin mentioned that they wanted music to transition between areas as well. (Of course…)
“But, it takes a few seconds to pull the curtains back up and drop you into that robot conga line for another try. What if I respawn right in the middle of a marching part?”
Glad you asked, simulated reader!
At the point when you die, the game engine’s checking the playback head over the march’s Music Playlist container to evaluate how much time will have passed by the time the player’s back in control after being revived.
    // I don't actually code, and this is why
    if (StartOfDontMoveSection < Music.PlaybackLocation + RespawnTime
            && Music.PlaybackLocation + RespawnTime < EndOfDontMoveSection)
        KeepPlaying();
    else
        SoundEngine.SetSwitch(AdditionalRespawnTime);
That looked a lot cooler when it existed only in my brain. Basically though, if the game’s going to respawn you during a march, it instead segues into a two bar transition of more Don’t Move soundscape that buys you some time to react. Bear in mind again that the game can know these timings intimately, because game logic time has been slaved to music time; no matter how crummy your system, the game and Wwise are in constant deterministic communication. “Hey, garbage collection’s done–ready to reload. Where are we?”
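Here is a slightly more careful take on that scribble, as a hedged Python sketch. The function names, the bar lengths and the returned action labels are hypothetical, and I am assuming a loop of four march bars followed by four Don't Move bars, with time measured in bars. It is my reading of the behaviour Martin described, not the game's actual logic.

```python
def respawn_lands_in_march(playhead_bars: float, respawn_bars: float,
                           march_bars: float = 4.0,
                           rest_bars: float = 4.0) -> bool:
    """Predict whether the player regains control during marching bars.

    Assumes the loop runs march first, then rest, and wraps around.
    """
    cycle = march_bars + rest_bars
    landing = (playhead_bars + respawn_bars) % cycle
    return landing < march_bars


def plan_respawn(playhead_bars: float, respawn_bars: float,
                 transition_bars: float = 2.0):
    """Return a (music action, bars until the player is in control) pair."""
    if respawn_lands_in_march(playhead_bars, respawn_bars):
        # Segue into extra Don't Move material to buy the player time.
        return "play_dont_move_transition", respawn_bars + transition_bars
    return "keep_playing", respawn_bars


print(plan_respawn(playhead_bars=3.0, respawn_bars=1.5))
print(plan_respawn(playhead_bars=6.0, respawn_bars=2.5))
```

In the first call the player dies late in the march and comes back during the rest, so nothing changes. In the second, the reload would drop them into the start of the next march, so the sketch asks for the transition instead.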
Here’s another cool one.
There’s a part of the game in which a distant wave cannon is firing into the foreground, directly towards the screen. Like the way you’d throw guys in Turtles in Time.
As each pulse lands, things splinter and smash, sheet metal flies away, the screen shakes. You’ve got to run the Boy left to right, up and down, from cover to cover within the merciful intervals in which this weapon reloads.
No one called this thing a “wave cannon,” but in starting to write this, it occurred to me that I don’t know what sort of blast this thing’s actually firing. And that’s because there’s no visual cue that shows you each of these pulses coming in, only the aftermath their impacts leave on the environment.
Here again, you’re forced to listen to survive. All of the logic surrounding whether the Boy lives or dies is slaved to the metronome of the level.
The metronome, in this case, is quite literally the entire envelope of this terrifying weapon sound, whose variations are implemented as a Playlist within the Interactive Music Hierarchy.
Just like with the march, the playlist container of the wave cannon envelope’s dotted with annotations that send information out to the engine: trigger blast effects, shake the screen, check to see if the Boy’s behind cover or not. The last one’s key, because the game is actually checking to see if you’re behind cover or not at the peak of the sound but before all the visuals of the sound rippling through the foreground fire off.
I think this delay was baked in to allow a little grace window for the player to scramble, but it means that you could be behind cover at the moment of this check, duck out into the open right afterwards and be found safe, even as you stand amidst all the flying wreckage.
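To make that ordering concrete, here is a small Python sketch of cues on a timeline driving game logic. The `Cue` type, the `fire_cues` helper, the cue names and the timings are all invented for the example; in the real project these would be annotations on a Wwise music segment that arrive in the game as callbacks. The sketch just shows why a cover check placed ahead of the visual cues produces the grace window described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    time_s: float   # position on the cannon envelope, in seconds
    name: str


def fire_cues(cues, prev_s, now_s, handlers):
    """Dispatch every cue whose time falls in the interval (prev_s, now_s]."""
    fired = []
    for cue in sorted(cues, key=lambda c: c.time_s):
        if prev_s < cue.time_s <= now_s:
            handlers[cue.name]()
            fired.append(cue.name)
    return fired


state = {"in_cover": True, "alive": True}
log = []


def check_cover():
    # The only place where the blast can kill the player.
    if not state["in_cover"]:
        state["alive"] = False


handlers = {
    "check_cover": check_cover,
    "blast_fx": lambda: log.append("blast"),
    "screen_shake": lambda: log.append("shake"),
}

# Hypothetical envelope: the check sits at the peak, visuals slightly later.
envelope = [Cue(1.0, "check_cover"), Cue(1.2, "blast_fx"),
            Cue(1.2, "screen_shake")]

fire_cues(envelope, 0.9, 1.1, handlers)   # judged while still behind cover
state["in_cover"] = False                 # player ducks out into the open
fire_cues(envelope, 1.1, 1.3, handlers)   # wreckage flies, verdict already in
print(state["alive"], log)
```

The player ends up standing in the debris and alive, because survival was decided by the cue on the sound, not by the effects that follow it.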
Disconnect // Net Effect
In a lot of games, that scenario’d seem like a bug. But, I find this moment of potential disconnect between sound and visuals super interesting, because it actually serves to strengthen the connection between sound and survival.
Surviving despite the visuals telling you you shouldn’t have lets the player know: It’s not the impact what kills you. It’s the sound.
The sentry doesn’t kill you–the march does.
I haven’t played a minute of INSIDE, and I’ve only seen what Martin’s been willing to show, but I would bet you that the game’s full of tiny little asynchronicities that cement this relationship over the course of the game. Moments that cut out the middleman and teach the player that sound is the gentle jury and dispassionate executioner.
Sound doesn’t point to the reward, the feeling that you’re safe–it quite literally means those things.
A positive music transition didn’t happen because you passed a checkpoint, it IS the checkpoint.
Maybe? Could be that I’m giving the guy too much credit, but I kinda stand with @mattesque:
The Expo Floor
On the last day of GDC, I finally made time for the Expo Floor, and just enough to get to the Audiokinetic booth. It happened to be during one of Martin’s demo slots. It was, as the rhapsodic quality of this article has likely made clear, one of the highlights of my conference–not because of what it contained, but because of how it made me feel about myself afterwards.
I don’t have a great extended metaphor to wrap around what it’s like to check out a new Wwise project from the creator of LIMBO; suffice to say, a cloned copy of LIMBO’s Wwise project, dirty production laundry and hacks and all, is a checkbox you can elect to grab when you’re installing a new build of Wwise. Like, the way you’d download a Hendrix guitar simulator or how you page through Richard Devine presets in every fucking sound design tool you use, that’s where Martin is with Wwise. It takes a certain mind.
Anyways, I expected to see some super next-level shit and come away feeling like I needed to get a thankless job aboard some Alaskan fishing boat, so destroyed was my confidence.
But actually? I kinda understood everything that was going on. I think you will, too.
Even having seen the wiring in the walls, the thing that struck me about INSIDE’s implementation wasn’t its complexity–it was its elegance.
You know when you build a standalone Wwise demo for a reel, and because you don’t have all the live inputs a game would actually have, you have to rig up all sorts of little tricks to simulate the Soundcaster session feeling like the real thing might? (Shameless plug.) Well, the INSIDE Wwise project felt kinda like Martin did that, and then everyone else just made the game work around it.
Or like a track layout after you’ve gone in and pruned out all those garbage layers that weren’t helping anything. Clean, purposeful, only what needed to be there.
Some cool stuff that’s going on:
- Physics Is It: Animations are not tagged. Besides footsteps, anyways. Player limb physics velocities and angles are measured and sent in as RTPCs, which flip switches on constantly-running events that trigger bits of movement coming through when thresholds are crossed. Bet you this matrixes with the material system in place to make sure the right sound plays every time. Several games are doing this and have done it already! Just a reminder that if you can, you should, too. Where is the line between “footsteps” and “foot-based physics system impacts” drawn, anyways?
- Beat Your Feet: Player breath events are fired according to the player’s run speed: the game measures the frequency of the Boy’s footfalls and generates a rough BPM out of that, firing breath requests off at the intervals that naturally emerge. This keeps the breaths consistently in sync with the player’s movement, as well as lets things settle into the same rhythm we do when we’re running and breathing out in the wild.
- It’s the Little Things: Running consistently puts the player into various states of exhaustion, which switch breath audio into more exasperated-sounding sample sets, as well as probably adjust the trigger rate to some degree. There’s a cooldown period thereafter as the player recovers. (It wouldn’t shock me if they’re driving the Boy’s exhausted animations from these RTPCs, rather than the other way around. But I didn’t ask.)
- What They’re Meant For: Blend Containers everywhere. I’ll be honest, I haven’t used one of these things for more than “play a bunch of stuff at once” holders for a long time, now, but it doesn’t shock me that they’re in place aplenty in INSIDE, and used exactly as intended. Take water movement: Martin had the Boy leap onto a rope above a pool of water, shimmying down it until he could swing back and forth across the water’s surface. A bunch of light, lapping swishes play as you skim the shallows. Descend further, and those movements blend into a sample set of deeper water movements without skipping a beat.
- Little Things, Vol. 2: The Boy’s wetness is tracked after he leaves the water, contributing to a sweetener layer on footsteps and movements (material-specific, of course) until, after enough movement, that’s gone down to zero. I bet you there are even blend containers there for various states of wet-dry balance.
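Two of the items above (breaths paced by footfalls, and wetness that wears off with movement) are simple enough to sketch. This is a guess at the general shape in Python, not the project's implementation: `BreathPacer`, `dry_off`, the two-steps-per-breath ratio, the four-step window and the drying rate are all assumptions of mine. In the game the resulting values would presumably be sent to Wwise as RTPCs.

```python
class BreathPacer:
    """Derive a breath interval from the spacing of recent footfalls."""

    def __init__(self, steps_per_breath: int = 2, window: int = 4):
        self.steps_per_breath = steps_per_breath   # hypothetical ratio
        self.window = window                       # footfalls to average over
        self.footfalls = []

    def on_footfall(self, time_s: float) -> None:
        # Keep only the most recent `window` footfall times.
        self.footfalls = (self.footfalls + [time_s])[-self.window:]

    def breath_interval_s(self):
        """Seconds between breaths, or None until two footfalls are known."""
        if len(self.footfalls) < 2:
            return None
        span = self.footfalls[-1] - self.footfalls[0]
        step_interval = span / (len(self.footfalls) - 1)
        return step_interval * self.steps_per_breath


def dry_off(wetness: float, distance_moved: float, rate: float = 0.1) -> float:
    """Reduce a 0..1 wetness value as the character moves; never below 0."""
    return max(0.0, wetness - rate * distance_moved)


pacer = BreathPacer()
for t in (0.0, 0.5, 1.0, 1.5):       # a steady run, one step every 0.5 s
    pacer.on_footfall(t)
print(pacer.breath_interval_s())      # one breath per two steps
print(dry_off(1.0, distance_moved=4.0), dry_off(0.2, distance_moved=10.0))
```

Speed the footfalls up and the breath interval shortens with them, which is what keeps the two in sync without tagging anything by hand.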
One More Thing
If someone asks you about the audio of INSIDE and you wanna get ’em excited about it really quick, drop this bomb:
“[Its] sound design’s in the Interactive Music Hierarchy, and its music lives in the Actor-Mixer Hierarchy.”
Okay, not all of it is. Like basically all the movement sounds I just listed up there are probably not.
But as the natural counterpart to that wave cannon’s spot in a Music Playlist, much of the game’s ambient music is fired off within the Actor-Mixer Hierarchy. Annotations on the looping playlists of the game’s ‘music’ (read: its sounds) are sent out to the engine and back in as Play Events that trip random containers full of variably trigger-rated pads and bits of texture. INSIDE’s music, basically, is set up like you and I would set up unseen ambience emitters. Neat trick!
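If it helps to picture that round trip, here is a plain-Python stand-in. It does not use or imitate the Wwise API; `RandomContainer`, `on_music_cue`, the sample names and the 0 to 2 second delay range are all hypothetical. Each timeline annotation becomes one play request: a randomly chosen texture, scheduled after a randomly chosen delay.

```python
import random


class RandomContainer:
    """Pick a random variation, avoiding an immediate repeat."""

    def __init__(self, variations, rng):
        self.variations = list(variations)
        self.rng = rng
        self.last = None

    def play(self):
        # Exclude the previous pick unless it is the only option.
        choices = [v for v in self.variations if v != self.last]
        self.last = self.rng.choice(choices or self.variations)
        return self.last


def on_music_cue(container, rng, min_delay_s=0.0, max_delay_s=2.0):
    """Turn one timeline annotation into a (delay, sample) play request."""
    return rng.uniform(min_delay_s, max_delay_s), container.play()


rng = random.Random(0)   # seeded so the sketch is repeatable
pads = RandomContainer(["pad_a", "pad_b", "pad_c"], rng)
requests = [on_music_cue(pads, rng) for _ in range(20)]
for delay, sample in requests[:3]:
    print(f"{sample} in {delay:.2f}s")
```

The loop that carries the annotations stays perfectly regular, while what the player hears on top of it never lines up quite the same way twice.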
I had a few questions for Martin that I wasn’t able to ask during the talk, because there were lots of others who went before me. So, I had an extra day or two to work them over in my head before I got the chance to pose them. By Friday, I felt pretty confident about what his answers would be, especially so after having toured the Wwise project–which was laid out so sensibly, refined down to such a point that all the cool things it was doing felt pretty self-evident.
Just the same, I had to know a couple of things:
Q: What do you do with five years’ worth of time on a game? What takes the longest–the asset creation, establishment of the style, systems and implementation?
A: “Yes, [the implementation].”
This squares with my experience as well. Yes, sound design and palette establishment can take a while, ditto the core engine setup, but I’ve worked on a couple of Wwise projects, and you don’t go straight from concept to getting them looking like that. Everything was so spartan and deliberate; you could tell that this was a project where they got to go back after all the messy experimentation and do exactly what they needed to from ground up.
It creates the appearance of unbroken genius from start to finish, but the only thing unbroken about it’s the intent, really.
We may never have the time to fully refactor all of our experiments into project-wide elegance, but it’s a reminder to upkeep as you go. As projects stretch onwards, hacks bite back.
And the big one.
Q: In a game where everything’s looking to the audio to make its judgment calls, the sound designer wields a tremendous amount of power: you shorten one of those wave cannon loops or move an annotation, and suddenly, the level becomes impossible to pass.
This top-down causality makes you kind of a Game Designer on INSIDE.
How was that balancing act with the rest of the team? Were there a lot of interdisciplinary power struggles and overstepped boundaries? Or did you all just kind of work it out amicably for the sake of the game?
A: “Yes, [we just worked it out].”
I mean, there you have it.
It strikes me that on a larger team or in more egoistic environments, this kind of thing wouldn’t fly. People can get precious about their work, and this subversion of the trickle-down flow where audio’s usually the last to know and the most affected–well, it’d take a certain somebody to ease that paradigm shift through. Martin strikes me as that kind of guy. If you’ve listened in to any of this year’s Sightglass discussions, the hearsay on how audio managed to work out such a sweet deal on this game came down to a
“If Martin wants it, Martin gets it”
… sort of vibe within the team.
But, I don’t think it needs to be that way.
Take this, from this year’s talk on the Music and Sound Design of Ori and the Blind Forest–
That’s to say, if you’ve got a team that’s willing to have the hard discussions, battle it out, put the game before all else–all of these little struggles eventually become scraps of forgotten noise.
I don’t doubt that Playdead works much like this. But who can say!
I’m still not sure of what to say here. There’s no one big secret weapon trick I’ve walked away with from my time with INSIDE. If I had to stretch for some global ones:
- If you weren’t convinced that Proactive Audio was gonna be a thing yet, well. Even Playdead’s doing it!
- Animation tagging stuff sucks; it’d slipped my mind for a while that physics-based movement is a much better way to go.
As a personal takeaway, though?
The entire experience of this talk, where Martin laid out the showstopper points of what he’s doing–and the Wwise project, where you got to see all the less glamorous finishing touches that nonetheless adorn the game–left me feeling, well, strangely empowered. It’s been five years since I last saw Martin speak, and I left that talk with my head in a million pieces. Maybe I’ve learned a few things since then, because I was able to keep it well together this time around. It’s a credit to the way he thinks about sound, and to the accessible degree everything’s been refined, that I listened to and looked at all this stuff and just felt like, “I get it.”
Made me feel like maybe I could do something down the line that’d make someone else feel the same way.
As a parting shot, then, maybe this is a reminder to acknowledge the way you’ll walk away from this article having understood a few things–and to give yourself credit for that.
None of us is Martin, but even Martin’s got to brush his teeth in the morning.
We’ll all make a thing like this someday if we just stick with it.
Get Yer Stig On
As a coda to all this, here’s a solid collection of Stig-related resources for you to geek over.
- A recent video interview with him:
- Here are the publicly un-vaulted talks Martin delivered at GDC 2011 / 2012:
- A growing collection of many, many LIMBO-related pieces of content:
- You can see the LIMBO Wwise project for yourself! Check the appropriate box when installing:
- Here’s a nice Gamasutra write-up on the creation of 140:
- INSIDE’s got some press material up that you should go and see:
And finally, a Designing Sound interview with the man himself:
Hey Luca, thanks for the kind words – it’s interesting to hear your takeaways from our demo. If someone should be interested in more info about the implementation, here are the slides from my talk at AES Audio for Games conference 2016. http://www.schmid.dk/talks/2016-02-11-aes/schmid-AES_2016-The_Boy_from_INSIDE.pdf
Jakob, thanks so much for checking this out! The whole experience with INSIDE left quite an impression on me, writing it all up was just a way to recodify whatever lessons I’d learned for the future.
I’ll absolutely take a look at the slides. If you’ve spotted any inaccuracies here, lemme know–really just working off my notes and memories. Cheers on the great work!