BFG 10K: Procedural Audio

In this post I will explore the relatively underused approach of using procedural audio to drive elements of game audio, rather than the more widely used ‘data-driven’ approach of playing back many individual sound files. There seems to be much uncertainty and scepticism surrounding procedural audio in the game audio community, so I will explore whether it could be a viable way to replace or augment audio systems in the current production paradigm.

In the video games world procedural audio (PA) refers to the computational process of generating audio from nothing, or almost nothing [1]. In an interview, legendary sound designer and computer scientist Andy Farnell describes PA as “a philosophy about sound being a process and not data.” [2].

The main principle is that, rather than focussing on creating audio assets, the sound designer focuses on creating a system by which these assets are modelled. In a game context, rather than storing many wav files in RAM or on the game disk, audio assets are created at run-time and are modulated and controlled in real time using game parameters. The end goal is a convincing system that produces audio equivalent to pre-recorded files but is constructed using little to no wav data.

Methodologies

According to Nicolas Fournel, Principal Audio Programmer at Sony Computer Entertainment Europe, there are two main paradigms when designing a procedural audio system [3].

Teleological Modelling (Bottom-up approach)

This is the process of creating an asset model from the ground up, based on real-world laws of physics. Of the two methodologies it is the more challenging, because it requires an in-depth knowledge of synthesis and of sound production mechanisms (physics, mechanics, anatomy etc.).
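As a rough illustration of the bottom-up idea, the sketch below uses modal synthesis, one common physically inspired technique, to render an impact as a sum of decaying sinusoids. The mode table, the function name `render_impact` and the `strike_force` parameter are hypothetical values chosen for illustration; a real physical model would derive the modes from the object's geometry and material.

```python
import math

SAMPLE_RATE = 44100  # assumed output rate in samples per second

# Hypothetical vibration modes for a small metal object:
# (frequency in Hz, decay rate in 1/s, relative gain).
METAL_MODES = [(523.0, 6.0, 1.0), (1410.0, 9.0, 0.6), (2790.0, 14.0, 0.35)]


def render_impact(modes, strike_force, duration=0.5, sample_rate=SAMPLE_RATE):
    """Sum one exponentially decaying sinusoid per vibration mode."""
    n = int(duration * sample_rate)
    out = [0.0] * n
    for freq, decay, gain in modes:
        for i in range(n):
            t = i / sample_rate
            envelope = math.exp(-decay * t)  # energy lost over time
            out[i] += strike_force * gain * envelope * math.sin(2 * math.pi * freq * t)
    return out


# A harder hit from the physics engine simply scales the excitation.
samples = render_impact(METAL_MODES, strike_force=0.8)
```

No recording is involved: changing the mode table changes the object, and changing `strike_force` changes the hit.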

With a system such as this, Andy Farnell suggests that many of the tedious object interactions that a sound designer would otherwise have to create (environmental sounds such as wind, object impact noises etc.) would be created automatically, allowing the sound designer to focus on more ‘emotionally significant’ sounds such as character weaponry [2].

Nikunj Raghuvanshi, a researcher in the field of physically-based modelling, argues that this type of methodology would result in a loss of artistic control for the sound designer [4]. Farnell, however, proposes that a slight structural re-shuffling of game audio departments would mean that one main sound designer would preside over a team of programmers, guiding the process from a more aesthetic perspective.

Although this method can create realistic results, it is computationally very demanding. In a game, where CPU cycles are shared between many other systems, it does not yet provide an efficient enough replacement for a data-driven approach. With further funding and research, however, it is not unreasonable to predict that a much more efficient system could be constructed.

Ontogenetic Modelling (Top-down approach)

Using this approach, the sound designer provides a base sound for the required asset, which is analysed and deconstructed to extract its characteristics. This data is then used as a template to partially or fully re-create the sound using various synthesis techniques [5]. The sound designer can work in a workflow very similar to existing techniques, so it would require very little restructuring to fit into existing game design methodologies.
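The analyse-then-resynthesise loop described above can be sketched in a few lines. This is a minimal illustration only: it assumes a short base sound whose partials fall exactly on DFT bins, uses a naive DFT instead of an optimised FFT, and discards phase. The function names `analyse_partials` and `resynthesise` are hypothetical.

```python
import math


def analyse_partials(samples, sample_rate, count):
    """Naive DFT: return the `count` strongest (frequency, amplitude) pairs."""
    n = len(samples)
    found = []
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        found.append((k * sample_rate / n, 2 * math.hypot(re, im) / n))
    found.sort(key=lambda p: p[1], reverse=True)  # strongest first
    return sorted(found[:count])                  # then order by frequency


def resynthesise(partials, length, sample_rate):
    """Additive synthesis: rebuild the sound from the analysed partials."""
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in partials)
        for i in range(length)
    ]


# Stand-in for the designer's recording: two sine partials at 440 Hz and 1200 Hz.
RATE, LENGTH = 8000, 400
base = [
    math.sin(2 * math.pi * 440 * i / RATE) + 0.5 * math.sin(2 * math.pi * 1200 * i / RATE)
    for i in range(LENGTH)
]
partials = analyse_partials(base, RATE, count=2)
rebuilt = resynthesise(partials, LENGTH, RATE)
```

Once the sound is reduced to a list of partials, those numbers become the template: they can be stored instead of the recording and altered at run-time.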

Of the two methods, the top-down approach is slightly more CPU friendly and, because a model is provided as the basis for sound construction, requires less specialised knowledge to create audio assets.

Advantages

Procedural design offers an alternative to the widely adopted ‘data-driven’ method of using many different wav files to sonify the game environment. At a glance, it has many advantages over using pre-recorded material [3]:

It saves memory by using code instead of wav data

Using wav files for sound reproduction means either streaming them from disk (usually reserved for longer, looping sounds and music) or loading them into RAM for playback at run-time. With either method the sound designer competes for space with the graphics components and game engine, and thus has to compromise on the sounds they produce [4]. With a PA approach, audio assets are synthesised at run-time and thus require a fraction of the disk and memory space needed for wav data.
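To put a rough number on the wav side of that comparison, the size of uncompressed PCM audio is simply duration × sample rate × bytes per sample × channels. The sketch below assumes CD-quality stereo (44.1 kHz, 16-bit) and a hypothetical set of twenty footstep variations; shipped games often use compressed formats, so real figures would be lower.

```python
def pcm_bytes(seconds, sample_rate=44100, bit_depth=16, channels=2):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return int(seconds * sample_rate * (bit_depth // 8) * channels)


ten_second_loop = pcm_bytes(10)        # 1,764,000 bytes, about 1.76 MB
footstep_set = 20 * pcm_bytes(0.5)     # twenty half-second variations, also 1,764,000 bytes
```

A synthesis model for the same footsteps is mostly code plus a handful of parameters, which is where the memory saving comes from.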

Better response to game physics

Because PA is generated in real time, the system can respond much more realistically to the game’s physics system. The requirements for a sound can be calculated and synthesised from game parameters, keeping the two components closely synchronised. This is particularly useful for sounds produced by rolling, sliding or scraping within the game environment, where a ‘data-driven’ approach may be too clumsy to recreate the interaction realistically.
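A minimal sketch of this coupling, assuming the physics engine reports a normalised sliding speed between 0 and 1 once per audio block: the speed sets both the loudness and the brightness of a filtered-noise scrape. The mapping constants and the name `scrape_block` are hypothetical, and filtered noise is only a crude stand-in for a proper friction model.

```python
import math
import random

SAMPLE_RATE = 44100  # assumed output rate


def scrape_block(speed, state, rng, block_size=256):
    """Render one block of low-pass-filtered noise for a sliding contact.

    speed: normalised sliding speed (0..1), assumed to come from game physics.
    state: last filter output, carried over so blocks join without clicks.
    """
    speed = max(0.0, min(1.0, speed))
    cutoff = 200.0 + 4000.0 * speed  # hypothetical mapping: faster is brighter
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff / SAMPLE_RATE)
    out = []
    for _ in range(block_size):
        noise = rng.uniform(-1.0, 1.0)
        state += alpha * (noise - state)  # one-pole low-pass filter
        out.append(speed * state)         # faster is also louder
    return out, state


rng = random.Random(7)
state = 0.0
audio = []
for speed in [0.1, 0.4, 0.9, 0.3]:  # one value per physics update
    block, state = scrape_block(speed, state, rng)
    audio.extend(block)
```

Because the sound is recomputed every block, it follows the object continuously instead of switching between pre-recorded ‘slow’ and ‘fast’ loops.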

Reduces repetition

Due to the creation of an ‘asset model’ rather than just an asset, PA is well suited to creating game sounds and textures that more closely mimic real-life situations. The use of a more granular, low-level structure means that sound interactions could be modelled to ensure that the same sound is never heard more than once. This is something which is highly sought after by sound designers but is almost impossible (with current technology) using the ‘data-driven’ approach. One-shot sound effects such as footsteps, gunshots and impacts can benefit hugely from a procedural approach, as these types of sound are constantly heard by the player and any noticeable repetition can break the immersion of the game.
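One simple way to get this variation is to draw the model's parameters afresh for every trigger. The sketch below is a toy footstep (a low thud plus a noise burst) with hypothetical parameter ranges; it is not a realistic footstep model, only a demonstration that each call yields a slightly different sound.

```python
import math
import random

SAMPLE_RATE = 44100  # assumed output rate


def render_footstep(rng, duration=0.15):
    """One footstep: decaying low thud plus noise, jittered on every call."""
    thud_freq = rng.uniform(70.0, 110.0)  # hypothetical ranges, tuned by ear
    decay = rng.uniform(25.0, 40.0)
    noise_mix = rng.uniform(0.2, 0.5)
    out = []
    for i in range(int(duration * SAMPLE_RATE)):
        t = i / SAMPLE_RATE
        envelope = math.exp(-decay * t)
        thud = math.sin(2 * math.pi * thud_freq * t)
        grit = rng.uniform(-1.0, 1.0)
        out.append(envelope * ((1.0 - noise_mix) * thud + noise_mix * grit))
    return out


rng = random.Random(2013)
steps = [render_footstep(rng) for _ in range(4)]  # four different footsteps
```

With a sample-based system the same effect needs a bank of recorded variations; here the variation costs three random numbers per step.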

Obstacles

It may seem like a no-brainer to adopt PA for many audio systems in interactive games, but in reality the solution is not quite that simple. There are many issues that developers of PA systems have to overcome before it becomes a viable option for wide-scale integration into interactive games. A few of the main obstacles include:

Variable CPU cost

The variability that is one of PA’s main advantages is also the cause of one of its main problems. Because dynamically created audio varies from moment to moment, it can be incredibly hard to predict the ‘cost’ (in terms of system resources) of producing a certain sound prior to execution, yet the cost of an operation must be predicted in advance so that system resources can be allocated accordingly. Farnell suggests that a system in which the sound ‘gracefully degrades’ depending on available resources could overcome such obstacles [6].
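One possible reading of ‘graceful degradation’, sketched under a deliberately crude cost model: estimate the cost of each partial of a sound up front, then render only as many of the loudest partials as the current budget allows. The cost figure `ops_per_sample`, the budget values and the function names are hypothetical, not taken from Farnell.

```python
def affordable_partials(total_partials, budget_ops, block_size, ops_per_sample=10):
    """How many partials fit in this block's budget (always at least one)."""
    cost_per_partial = block_size * ops_per_sample  # assumed fixed cost
    return max(1, min(total_partials, budget_ops // cost_per_partial))


def degrade(modes, budget_ops, block_size):
    """Keep only the loudest modes that the budget can pay for."""
    loudest_first = sorted(modes, key=lambda m: m[1], reverse=True)
    keep = affordable_partials(len(modes), budget_ops, block_size)
    return loudest_first[:keep]


# (frequency in Hz, gain) pairs for a hypothetical impact model.
MODES = [(523.0, 1.0), (1410.0, 0.6), (2790.0, 0.35), (4100.0, 0.2), (6050.0, 0.1)]

full = degrade(MODES, budget_ops=50_000, block_size=256)    # all five modes
reduced = degrade(MODES, budget_ops=6_000, block_size=256)  # two loudest only
```

The sound gets duller under load rather than dropping out, and the worst-case cost is known before rendering starts.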

Lack of skills and tools among existing sound designers

The games industry currently has a huge investment of skills and knowledge in existing tools, and Farnell suggests that “even sound designers who are comfortable with progressive technology feel threatened by the need to adapt their skills and learn new tools.” [6]. Before a new, potentially disruptive technology such as this could be widely adopted, suitable tool-chains would have to be developed and training given to existing sound designers.

Fear factor

As mentioned in the opening section of the post, at present there is much aversion to Procedural Audio methods in the game audio community. There are fears among many sound designers that a method such as this will replace them and that large corporate companies will see it as an opportunity to ‘streamline’ their production process by disbanding the audio department. Many of these fears are unfounded and as mentioned previously the integration of PA systems could potentially allow the sound designer to focus on more important and relevant tasks.

Conclusion

There is no doubt that using a procedural approach to generate game audio has many benefits. It can model a closer relationship between game objects and potentially free the sound designer from having to produce many tedious game-world interactions. It seems that it may require a small re-structuring of existing production methodologies but it would still allow the sound designer to have creative control over content produced.

Unfortunately, it seems that much research still needs to be carried out to create more efficient and better-sounding models before PA can reach an acceptable level for widespread integration into games. Andy Farnell equates its current state to the 3D graphics of early first-person shooters: “Once games did not have super 3D graphics, early titles like Wolfenstein and Quake were basically box walled mazes covered in low resolution textures. Synthetic sound is stuck at an equivalent stage of development, mainly because it has been excluded and neglected for 15 years.”

I believe that, with real investment and development, procedural audio offers an excellent addition to the game sound designer’s tool box. If it is viewed as a tool to augment current systems of audio reproduction and not as a direct replacement, the acceptance of this new technology does not seem so daunting. It is a technique which has already been used to great effect in several popular titles such as Spore [7] and Crackdown 1 and 2 [4]. If current scepticism and opposition can be overcome, I believe there is definitely a place for procedural audio among current and next generation games.

References

[1] http://www.develop-online.net/tools-and-tech/procedural-audio-with-unity/011743
[2] Stevens, R. and Raybould, D., 2011. The Game Audio Tutorial: A Practical Guide to Sound and Music for Interactive Games. Amsterdam; Boston: Focal Press/Elsevier.
[3] http://www.procedural-audio.com/papers/GDC%202011%20-%20Audio%20Boot%20Camp.pdf
[4] Raghuvanshi, N., 2011. Sound Synthesis in CRACKDOWN 2 and Wave Acoustics for Games. Available at: http://www.gdcvault.com/play/1014416/Sound-Synthesis-in-CRACKDOWN-2.
[5] Rutherford, S., 2008. Procedural Methods for Audio Generation in Interactive Games. Available at: http://medcontent.metapress.com/index/A65RM03P4874243N.pdf [Accessed November 4, 2013].
[6] Farnell, A., 2007. An introduction to procedural audio and its application in computer games. September, pp. 1–31.
[7] http://spaceoddity.sgsgames.com/?p=799


Wednesday, 6 November 2013
