The importance of audio in augmented and virtual reality

A review of the Welcome to the Real World - Immersive Audio for Augmented and Virtual Reality event

The uptake of, and interest in, virtual, augmented and mixed reality across a broad range of application areas, from gaming to engineering design, music production to health and journalism, is significant, exciting and a huge opportunity for creatives, technologists and storytellers, and no doubt many more besides. But to deliver engaging and transformative experiences that have real impact on their intended audiences, clever, interactive, immersive images and visualisations are only part of what is needed. Underpinning the whole immersive experience, blending across all other elements, and placing an audience or individual at the centre of the VR/AR/MR world, is the accompanying sound design. Filmmakers have of course known this since the advent of the first ‘talkies’, and much of the language and technique of cinematic sound is being translated, often via the medium of computer game sound design, across to immersive content creation. This has been made possible in no small part through the development of many new spatial audio tools and production techniques that support the immersive audio content workflow. This spatial audio content works closely with 3D VR visuals, engaging both our sight and hearing to deliver a complete and immersive experience of a virtual world. Much of this immersive audio technology is based on many years of leading research and development that is now finding its way into commonly used audio production workflows as well as interactive game engines and creative music applications, and the UK can rightly claim to have been leading in this area over many years.


The focus of the Welcome to the Real World - Immersive Audio for Augmented and Virtual Reality event in May 2017 was to highlight some of the UK’s most exciting current work in immersive audio research, development and content design. It brought together nearly 100 leading researchers and practitioners from academic, industry and creative sectors and provided an overview of the immersive audio creative process, considered application workflows and the potential for new developments. It also created a platform for demonstrating some of the most exciting immersive audio experiences created so far by leading teams around the UK.


Jelle van Mourik, an audio programmer at the London Studio of Sony Interactive Entertainment, was our first speaker. He has worked on several game titles, most prominently VR Worlds, a collection of five virtual reality experiences designed to showcase PlayStation VR and released at the same time as the Sony PlayStation VR headset. The studio's creative aim is to build larger-than-life experiences, but their design goal is never to break the sense of immersion in their games. Jelle pointed out that 95% of their audience use headphones for audio playback, and the team use natural mixing and realistic reverb to help build an audio scene. The result is often not 100% ‘real’ but is consistent with the images presented to the game player. Audio is used to ground the player in the game world, and is often designed into the gameplay itself, helping to locate objects or opponent players.


Gavin Kearney of the AudioLab at the University of York gave a whistle-stop overview of the diverse immersive audio research their team are involved with. The AudioLab’s work encompasses improved pipeline development for Google and YouTube360, immersive VR for ensemble singing performance, audio content and exhibition design for museums, environmental soundscape capture and rendering, and immersive music recording and production. As Gavin pointed out, the immersive technology market is large and hence a significant amount of new audio content will need to be produced to support it.


Gareth Llewelyn is a leading sound designer and co-founder of Mixed Immersion, a full-service creative 3D audio production company. He specialises in immersive audio production for VR, including 3D cinema mixes for a number of leading film titles. In 2009, Gareth helped mix the very first commercial Immersive Audio release for Skywalker Sound. He continues to pioneer new creative and technical approaches to making high quality audio work with a true depth of audience immersion, telling stories with sound using whatever technology is available to facilitate the creative end point. He presented a number of case study examples from his portfolio and touched on the problems of on-set immersive audio capture. Much more control is required over a scene, especially if dialogue is involved, and there is a heavy reliance on radio microphones, wireless foldback and on-set spatial audio capture, together with a need to ensure all equipment and microphones are well hidden out of sight! He pointed out that most 360-degree cameras are not yet up to the job of professional production, and the humble clapperboard is still vital for ensuring everything is in sync. His final comments advised that as the field develops, the deliverables will change, becoming much more complex, with a knock-on effect on the associated production budgets.


After a quick break, Catherine Robinson from the BBC gave an overview of immersive audio in broadcast, and in particular the use of BBC R&D’s spatial audio tools for craft production in TV, radio and 360-video. Catherine specialises in sound design for radio drama, binaural audio and 3D sound for 360-video and VR, and created the sound design and binaural mix for Ring, a horror drama for Radio 4. Following the success of this broadcast, Catherine went on to set up the first operational 3D sound studio in the BBC outside R&D. She talked about her experiences of working on many other immersive audio broadcasts for the BBC, noting that people do now listen communally with headphones, although it is still important to ensure that audiences listen with their headphones on the right way round! Like Gareth, she reflected on the added complexity of the delivery pipeline, noting that different platforms require different renders and that she made use of the Audio Definition Model to apply metadata to source content for different end uses. Many in the audience were very interested in her experience of working on the then very recent Doctor Who episode rendered for binaural listening and made downloadable via iPlayer. Catherine also noted how very claustrophobic normal stereo sounds after having worked for so long in 3D, an observation many of the audience could agree with.


Taking a somewhat different approach to the production and use of immersive sound, Alex Southern presented next on the use of immersive audio for engineering and acoustic design. Alex is the Auralisation Lead at AECOM, and an experienced research scientist who has been publishing and specialising for over 10 years in spatial audio, soundfield modelling, auralisation and related topics. Alex provides specialist technical support and innovation to the AECOM Acoustic team for both buildings and environmental projects as part of the Immersive Practitioners Group. Alex spoke generally about how the sound of our environment influences our health and wellbeing, defining auralisation as the audio equivalent of 3D visualisation. His main focus was on how immersive audio and auralisation might be used for critical engineering design. In other application areas, shortcuts might be taken to arrive at the end point experience, but this cannot be done in an engineering context: the immersive (audio) experience must be calibrated and rendered to model reality as closely as possible for effective inclusion as part of the engineering design process.


Our final presentation was given by Garry Haywood on the Art Gallery as R&D Space. Garry is the CEO of Kinicho, a start-up with a key focus on spatial audio for VR and AR, and his presentation explored how Kinicho used an arts centre as an effective platform for commercial R&D. Garry started off with the comment that headphone audio itself is an immersive experience, without any need for an additional VR headset, and posed the question of how we deliver precision audio for critical listening applications within five years. Aside from the technical aspects of immersive audio, Garry also discussed some of the artistic projects Kinicho have focused on recently, including work for Culture Lab, Newcastle, a three-day live installation at The Sage Gateshead and work for Hull City of Culture 2017. He made the interesting point that artists collaborating on such high-profile projects are usually more than willing to fail, but simultaneously want to deliver perfection for their audience.


A lively Q&A was held with our presenters, followed by a further break before heading into the final, formal part of the event: a panel that brought together a diverse group of audio experts working in immersive technologies from across the sector. Oliver Kadel represented 1.618 Digital, a company specialising in 360-film and gaming experiences, end-to-end production and sonic branding. He made the point that today's audiences have not as yet developed a taste for spatial fidelity; timbral fidelity is instead, or still, the key factor. Jon Eades from world-famous recording studio Abbey Road made the same point, that fidelity is really important for the music listening process, and made clear the somewhat different needs that arise when developing 3D music content. He added that existing binaural codecs are not so great in this respect, and that the product (3D music) ultimately does not yet sell itself. Will Buchanan from RPPtv is working on various solutions for media production, particularly in the area of sound effects for sound design, including immersive content. Working with end users to understand their needs is key, and there are limits to how engaging, complex 360 immersive environments can currently be built. This is where better, model-driven sound design is needed. Michael Kelly, Director of R&D at Xperi DTS Inc., considered the difficulty of developing “real” content: gaming is a relatively easy creative endpoint to manage in terms of immersive content, but real-world capture and rendering is much more complex to deal with, with many workflow challenges to surmount.


The panel discussion was opened up to questions from the assembled delegates. Audiences were a key subject that the panel kept returning to: are they the main problem in ensuring a successful and wider uptake, and in determining that immersive content is not just a gimmick? Educating audiences was seen to be important, and a considered effort will be needed to promote the technology and the experiences immersive content can enable. Communal and social VR experiences were thought to be a particularly important area to get right, so that the experience is not individual or isolating. However, as always, content is key. It should not be overcomplicated, but should bring extra richness to the medium, strike the right balance between interactivity and passivity, and increase empathy of experience through the impact of the immersive sound working with other aspects. All were agreed that we want to leave our audiences with an emotive experience that they remember and want to keep coming back to.


After the panel session was brought to a close, the event concluded with time for networking, plus an opportunity to try out demos from both Mixed Immersion and Kinicho, the latter of whom were just about able to squeeze their COSMOS portable 3rd Order Ambisonics Array into one of the demo spaces. This was a great event for Immerse UK - the first to bring together and promote the huge array of talent and interest the UK has in audio and music technology, and to show how it has a huge role to play in the development of immersive experiences, technology, content, research and development.


Thanks go to Dave Black of Mixed Immersion for his help and drive in bringing the event together, and to Reed Smith for hosting us in their amazing venue at the top of the Broadgate Tower.


Blog by Dr. Damian Murphy, AudioLab, University of York