Luca Fusi Sound Design | Implementation

28 Apr 2014

Wwise, Unity and Starting Something

I've spent the last few months on contract break / forced sabbatical from my time at Microsoft. And through the professional void, it's been personally fruitful. Thanks to living like an antisocial monk for most of 2013, I'd put away enough to take a long trip into Southeast Asia and wander about for a month.

(That deserves its own post--which it may or may not get--but you can view my efforts at photojournaling the whole thing over on my Instagram. It starts here, and I wish there were an easier way to reverse-chronologically browse this thing.)

Travel led into more travel: I got to take a trip to the Italian homeland with my dad and brother for a week's skiing, eating and pacing around downtown Rome. Then GDC. Then, a few weeks later, the annual VALVE Hawaii trip, which I'd been invited along to as a guest. I'm really blessed to have been able to live out this downtime as I have.

But amidst all the vacationing, the overactive brain wanders. You gotta feed it or it dies.

I've thought for a while that a real safe heading for game audio is the career path of the audio programmer. In my last year's experience on Spark, I can tell you that their time is an incredibly precious commodity. If you, the intrepid Sound Designer and Implementer, are the dreamer of big things, they are the ones that turn those dreams into executable reality. I don't care how good you are with Wwise or Unity or whatever, on any game of sufficient scope, and if you're trying to do anything that'd stand out against the forward-rushing edge of game audio, you will need a programmer's help. Sometimes, though, you won't get it.

What do you do then?

As preparation for a hopeful and glorious return to pay-stubbed game audio--and because I have a little game I'd like to make someday--I'll endeavor to decode some of this low-level magic that these guys do. And, jointly because I want to keep myself on rails and give you all something to read about, I'll be documenting what I find, showing my work, demystifying everything I can.

The simplest of sandboxes seems like a ready-made project where I can poke into some Wwise-Unity integration and figure out exactly what's going on. I know Wwise well enough and there's documentation on that particular spot where the middleware hits the engine.

Here's a mission statement of sorts:

I want to hook a Wwise project directly to a game engine, preferably Unity. This means taking a Wwise project with in-built RTPCs, Switches etc. and creating brand new hooks to them within the game code, compiling and experiencing the audio moving about.
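To make the idea of a "hook" a bit more concrete, here's a rough sketch in Python (not Unity code, and not the Wwise API) of what an RTPC boils down to: the game reports a number, and a designer-authored curve turns it into an audio property. Every name here (`RtpcCurve`, the health-to-heartbeat example, the curve points) is made up for illustration.

```python
from bisect import bisect_right


class RtpcCurve:
    """Piecewise-linear map from a game parameter to an audio property (hypothetical)."""

    def __init__(self, points):
        # points: (game_value, audio_value) pairs, sorted here by game_value
        self.points = sorted(points)
        self.xs = [x for x, _ in self.points]

    def evaluate(self, game_value):
        # Clamp anything outside the authored range to the end points
        if game_value <= self.xs[0]:
            return self.points[0][1]
        if game_value >= self.xs[-1]:
            return self.points[-1][1]
        # Find the segment containing game_value and interpolate linearly
        i = bisect_right(self.xs, game_value)
        (x0, y0), (x1, y1) = self.points[i - 1], self.points[i]
        t = (game_value - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)


# Assumed example: as player health (0-100) drops, a heartbeat layer gets louder (dB)
heartbeat_volume_db = RtpcCurve([(0, 0.0), (50, -12.0), (100, -96.0)])
```

In a real integration the game would only send the raw value across; the curve itself lives in the Wwise project, which is the whole appeal.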

Starter questions:

  • Can I do this via an already built Unity game simply integrating a Wwise project into it?
  • What languages would I need to learn to do it?

I really don't know anything about programming beyond some basic batch scripting stuff and a well-rusted primer on Python, courtesy of my time at VFS. So, expect a lot of frustration, doing things without really understanding how they're working and, hopefully, lightbulbs coming on.

Step 1's checking out the Wwise-Unity integration package and seeing what the deal with it is.

26 Mar 2014

GDC2014

Hello! It's been a minute. Lots to catch up on--it's probably best to just jump into present day and go from there.

Another Game Developer's Conference has come and gone, and I wanted to make sense of the whole experience, commit it to print before the day-to-day sinks back in. Let's take it point for point.

The People

If I've said it once...

The best thing about the game industry is the people within it. This is my second year as a semi-credentialed, guess-I-belong-here attendee of GDC, going by that AAA name on my conference pass--but the people of game audio have been welcoming for as long as I've had intent to join them. They're humble, kind and--thanks to the tireless #GameAudioGDC banner-flying of @lostlab--extremely visible at the conference itself.

Something I saw this year was a lot of folks going Expo Pass only, saving some scratch and eschewing the body of the conference for the networking fringe: hallway meetups and late-night business idea shares over overpriced drinks. When you've got a group as organized as game audio, it works. Each morning's Game Audio Podcast meetup at Sightglass was an informal chance to mull over the day's talks and go all wide-eyed about the future alongside all manner of rookies and vets. It's so fucking cool that the group's that close-knit, and I really need to thank Damian and Anton for setting that stuff up every morning.

My heart goes out to all the underrepresented disciplines who don't have that same social leadership, as hanging with these guys is always the best part of the conference.

The Talks

Of course, there was a lot to watch and hear that you could only get to with a badge. Everyone I spoke with agrees that GDC2014's talks were a notch up: ferociously technical and full of stuff you wanted to run back and put into practice. I've outlined two specific favorites below.

Two of the most talked-about presentations on the Audio Track were delivered one after another on Wednesday morning--and both by audio programmers. Tools, systems and implementation hooks are sexy, and a development team whose culture supports these things is one of the surest components of a great sounding game.

Aural Immersion: Audio Technology in The Last of Us

Jonathan Lanier's an audio programmer at Naughty Dog (do they have more than one? The luxury!) who spoke on the systems that went into the incredible sound of The Last of Us. That one was my game of the year--in an age when I'm spoiled for choice and spend far too much time considering, but not actively engaging with, my Steam catalog, TLoU had me running home from work to fire up the console and running my mouth around the coffee machine every morning with stories of the last night's play. Lanier outlined the Creative and Audio Directors' early pre-production talks, which set audio up for development support and eventual success, before digging into the technical ins and outs.

The audio team was able to ground their audio in the gritty realism of the world by hitching a ride on Naughty Dog's tried and tested raycast engine. This let them throw lines and cones around every crumbling environment, bringing back useful information that let them filter, verb out and otherwise treat their sound. In a game where you spend so much time crouching and listening, the sum of all these subtle treatments made for some incredibly tense pre-combat situations: planning my approach as unseen Clickers shambled and squealed somewhere off in the dark, or straining just a little bit to hear Ellie and realizing I'd jogged too far ahead.
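I don't know how Naughty Dog actually turns those raycasts into filter settings, but the general shape of the idea might look something like this Python sketch: fire a handful of rays from the listener toward the source, see what fraction are blocked, and map that to a low-pass cutoff. The `ray_blocked` callback, the ray count and the cutoff range are all assumptions for illustration, not anything from the talk.

```python
import math


def occlusion_fraction(listener, source, ray_blocked, spread=0.5, rays=5):
    """Fraction of rays from the listener toward points around the source that are blocked.

    ray_blocked(start, end) -> bool stands in for the engine's raycast query (hypothetical).
    """
    blocked = 0
    for i in range(rays):
        # Fan the target points across a short horizontal line through the source
        offset = (i / (rays - 1) - 0.5) * 2 * spread if rays > 1 else 0.0
        target = (source[0] + offset, source[1], source[2])
        if ray_blocked(listener, target):
            blocked += 1
    return blocked / rays


def lowpass_cutoff_hz(occlusion, open_hz=20000.0, closed_hz=800.0):
    """Blend the cutoff on a log scale so the change sounds even to the ear."""
    log_open, log_closed = math.log(open_hz), math.log(closed_hz)
    return math.exp(log_open + occlusion * (log_closed - log_open))
```

A source half-hidden behind a wall would come back with a fraction somewhere between 0 and 1, and the filter closes down by that much instead of snapping between "heard" and "muffled".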

What's important is that the team never became slaves to their own systems, putting the technique above the telling. They tried out HDR--the silver bullet audio solution of 2013--and found it didn't fit the type of perspective they were trying to put you in. So they rolled their own dynamic mixing solution. They liked the way enemy chatter faded out over distance, but that same falloff curve meant some key dialogue with Ellie could go unintelligible. So they sent enemy and friendly NPC dialogue through separately adjustable wet/dry treatments and reverb buses.

TLoU's audio tech is impressive, but nothing any AAA studio couldn't have dreamed up themselves. It's the fact that they got so much of it into the game--and had a studio that believed in audio; that gave them the resources to do all of that--that turned it into the greatest sounding game of the year.

The only shitty thing about this talk is that it was double-scheduled alongside A Context-Aware Character Dialog System. So, you had to pick one or the other. One to watch on the Vault later on.

The Sound of Grand Theft Auto V

This was the Audio Track talk that blindsided everyone this year: Alastair MacGregor's an audio programmer from Rockstar who brought with him an overview of what it took to accomplish the sound of Grand Theft Auto V. I feel Rockstar doesn't often go public about their methods and techniques--as Anton said in the podcast, Alastair's name on the program felt like "someone from Rockstar being let outdoors"--but I don't think anyone expected them to reveal what they ended up showing.

GTAV features 90+ hours of recorded dialogue, heaps of licensed music and sound design in what is almost certainly the audio budget record-breaker of last generation. All of this was powered by Rockstar's internal audio toolset, RAGE. It's maintained and developed by a team of audio programmers and sound designers that seems to be staffed there independent of any specific game project, i.e. they're a dedicated team. They've been iterating and improving upon RAGE from around the time of Grand Theft Auto IV, making RAGE--now versioned 3.0--at least five years in the making.

RAGE is insanely comprehensive in what it facilitates; it reads like a game audio Christmas list fulfilled. Thankfully, volunteers and event management were on hand to scrape flying chunks of blown mind off the walls as Alastair touched upon feature after feature. Here are a few highlights; you'll want to try to catch the talk or someone else's summary for more, because there was more.

GTAV didn't even ship on PS4, ergo: there is and will be more.

How RAGE Wins Everything

Synchronicity 3.0
When the team started running up against the wall of lining up microfragments of weapon audio and trigger timings, the RAGE team responded. The engine allows for sub-frame (i.e. more than once per 1/30th of a second, which is more often than most stuff in the game is ever making a call), synchronous, sample-accurate triggering of multiple assets in different formats. Designers could stack one gun layer in uncompressed PCM, another wrapped in XMA--which would need a little decoding--and the engine accounts for this, keeping everything locked up. Did I mention that GTA was so filled to capacity that the architects had to load audio into the PS3's video RAM to hit their goals? They did, and RAGE buffers for the transfer time out of video memory and still keeps things locked.
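The talk didn't go into how RAGE does this internally, so treat the following as a guess at the bookkeeping rather than their method: if each layer has a known preparation latency (decode time, transfer out of video RAM), you can start preparing each one early by exactly that much so they all hit the same output sample. The layer names, latency figures and 48 kHz rate below are invented.

```python
from dataclasses import dataclass

SAMPLE_RATE = 48000  # assumed output rate


@dataclass
class Layer:
    name: str
    latency_ms: float  # time needed before the first sample is ready (decode, memory transfer)


def schedule(layers, trigger_sample):
    """Return (play_sample, {layer name: sample at which to begin preparing it}).

    Every layer sounds at the same play_sample; the slowest layer decides
    how far past the trigger that has to be.
    """
    latencies = {layer.name: round(layer.latency_ms * SAMPLE_RATE / 1000) for layer in layers}
    play_sample = trigger_sample + max(latencies.values())
    prepare_at = {name: play_sample - lat for name, lat in latencies.items()}
    return play_sample, prepare_at


# Hypothetical gunshot: raw PCM is ready at once, XMA needs decoding, one layer sits in video RAM
gun = [Layer("pcm_crack", 0.0), Layer("xma_body", 2.0), Layer("vram_tail", 5.0)]
play_sample, prepare_at = schedule(gun, trigger_sample=1000)
```

The point of working in samples rather than frames is that the three layers line up exactly, not just "within the same 33 ms".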

Better Engines, Cheaper
GTAV's cars sound much better than its predecessor's. (I don't know this for sure. Haven't played GTAV yet! But, I'm taking Alastair's word for it.) Beyond simple loops, each instance of a car in GTAV is kitted out with not one, but two granular synthesizers--one for processing engine sounds, another for exhaust--that help to split source recordings into tiny, reassemble-able grains at runtime, stretching their audio further and reducing memory usage. Naturally, RAGE features a nice, graphical interface for the audio designers to tune these synths in and offers fine control, e.g. what sections of a specific sample to granulate, how to blend between these areas to create convincing idle transitions (which, as steady, non-pitching sounds, are typically poor candidates for granulation). They're even able to specify a % number of grains to use from each section to get really gritty about memory usage; get the sound believable, then start paring the complexity back and ride that fine line. Thoughtful options like this mean that these synthesizers can run with brutal efficiency, so that even the CPU load of double instances per car--and the game features a lot of cars--makes for an effective tradeoff vs. loading fatter loops into memory. GTAV's programmers are seventh-dan masters of the Cell processor architecture.
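To get a feel for the granular idea, here's a toy Python version using plain lists of samples. It is not how RAGE works (the talk didn't show code); it just rebuilds a longer stretch of audio by overlap-adding short, windowed grains picked at random from a chosen region of the source. The grain size, 50% overlap and Hann window are my assumptions.

```python
import math
import random


def hann(n):
    """Hann window of length n, so overlapping grains crossfade smoothly."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def granulate(source, out_len, grain_len=64, region=(0, None), seed=0):
    """Build out_len samples by overlap-adding random grains from source[region]."""
    rng = random.Random(seed)
    start = region[0]
    end = region[1] if region[1] is not None else len(source)
    window = hann(grain_len)
    hop = grain_len // 2  # 50% overlap between neighbouring grains
    out = [0.0] * (out_len + grain_len)
    for pos in range(0, out_len, hop):
        # Pick a grain anywhere inside the allowed region of the source
        g = rng.randrange(start, end - grain_len + 1)
        for i in range(grain_len):
            out[pos + i] += source[g + i] * window[i]
    return out[:out_len]


# A short stand-in "recording": 256 samples of a sine wave
source = [math.sin(2 * math.pi * 8 * i / 256) for i in range(256)]
stretched = granulate(source, out_len=1024)  # four times longer than the source
```

The `region` argument is a crude nod to the "which sections of the sample to granulate" control described above: a small source buffer ends up covering a lot more playback time than its length suggests.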

Like Promethean Fire
There's lots of talk about procedural audio these days: sounds spun up entirely out of oscillators and code, costing very little memory at the expense of some CPU usage. The idea is that at their best, procedural sound can free up valuable memory for larger, necessarily manmade assets like voiceover and orchestral music by covering all the little bits that you don't need to maybe get sounding 100% realistic. Footsteps, physics sounds, etc. At least, that's where most of us have been setting the near-term bar, because even making those sorts of sounds out of thin air is really freaking hard to do. The general consensus has been that procedural audio is coming, but isn't quite ready just yet.

Except that fully 30% of the sound effects in GTAV were created using RAGE's procedural audio editor.

Fucking 30%. Of a game that large. That shipped on the last generation.

Alastair spent some time demonstrating RAGE's modular synth-like interface that helped make this possible. It allows their audio designers to craft and tinker towards a procedural sound asset before exporting that synthesizer configuration as an asset that can run in-game. He auditioned a few that might as well have come from a microphone; apparently, Rockstar's sound designers are pretty much all Neo. This part of the talk thrust me through the full ten stages of denial and I eventually came around to stunned bewilderment.
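For anyone who hasn't poked at procedural audio, here's about the smallest possible example in Python: a sound made from nothing but an oscillator and an envelope, with no recorded asset at all. This is nowhere near what RAGE's editor was shown doing, and the "blip" patch, its parameters and the 48 kHz rate are made up; it's only meant to show the memory-for-CPU trade.

```python
import math

SAMPLE_RATE = 48000  # assumed output rate


def blip(freq_hz=880.0, duration_s=0.1, decay=30.0):
    """Render a decaying sine 'blip' as a list of float samples in [-1, 1]."""
    n = round(duration_s * SAMPLE_RATE)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        envelope = math.exp(-decay * t)  # exponential fade-out
        oscillator = math.sin(2 * math.pi * freq_hz * t)
        samples.append(envelope * oscillator)
    return samples


# The stored "asset" is just three numbers; the audio is computed when needed
sound = blip(freq_hz=660.0, duration_s=0.05, decay=40.0)
```

Getting from this to something that passes for a recorded footstep is the really freaking hard part.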

tl;dr Rockstar's audio tech is years ahead of everyone and we all had no idea.

Everything Else

Gosh, there's still so much to go over. FMOD, Wwise and Fabric battling it out to become the de facto indie audio solution of the future, just as Unity spools up its own tech. Unreal Engine coming down from its status as a AAA platform to meet the little guys with a cheapish subscription model, and throwing back the curtain on Blueprint, its new visual scripting tool for quickly creating awesome looking stuff.

It was a week of ingestion whose digestion continues. I'll likely have more to say once the whole of the conference hits the online Vault. The plan is to kick back and nerd it up with some coworkers, catch all the stuff we missed from the Audio Track and beyond. I'm sure there's lots in there that'd equally inspire.

For now, it's time to cool my spending, crack into a side project or two and thank everyone who made last week so amazing.

#GameAudioGDC is a truly happy place.