Is spatial audio the missing link in enterprise VR?
Cinema goers are familiar with the experience of being enveloped by sound. This is made possible by spatial or immersive audio, which is designed to recreate the real-world experience of how people hear.
The concept is also applied to virtual and augmented reality (VR/AR), where it adds another dimension to virtualised environments and is used for a variety of non-entertainment applications – including healthcare and the military.
The Audio Engineering Society (AES) describes spatial sound as “an essential underlying technology for VR and AR” that can be used to deliver not only a “sense of reality” but also “hyper-reality”.
For virtualisation in enterprise applications, spatial audio offers the same immersive qualities consumers experience in films, games, and VR/AR experiences at theme parks – plus a high degree of isolation, blocking out extraneous noise and allowing the user to hear only what is relevant to what they are doing.
Object-based audio: a brief history
The main technologies for spatial sound are object-based audio (OBA), binaural and Ambisonics.
Dolby Atmos is the most common OBA system currently in use and is based on a combination of audio channels – either 5.1 or 7.1 – and multiple ‘objects’, which can be placed or moved anywhere in a sound picture.
The earlier 5.1 systems, including Dolby Digital, produced surround sound that covered the length and width of a cinema. Atmos, and other OBA systems, add the dimension of height through the addition of ceiling loudspeakers, making for a more immersive experience.
As a side note, the terms ‘spatial audio’ and ‘immersive audio’ are often used interchangeably but there is a subtle difference between them; spatial is a technological term, while immersive refers to the experiential aspects of a system.
The key difference between OBA and the other two approaches is that an OBA sound image is built in a mixing studio from multiple individual tracks, while binaural and Ambisonics capture the sound in a space as it happens.
Binaural was the earliest attempt to reproduce sound as it is heard by the human ears, dating back to the late 19th century. The process traditionally involves a dummy head fitted with two microphones, one for the left ear and one for the right, which pick up all the sounds – and reflections – in each environment.
This produces a very accurate sonic image but one which can only be listened to on headphones, with the individual sounds fixed to their original recorded positions. This initially made binaural unsuitable for VR, because the sound image would turn with the listener's head. The problem has since been solved by the development of head-tracking systems, which are now found, alongside spatial audio, on Meta Quest (formerly Oculus) and Microsoft HoloLens headsets.
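To illustrate the head-tracking idea, here is a minimal Python sketch of the underlying geometry. It is not taken from any headset's software: the function name, the degree units and the angle convention (0° straight ahead, positive angles to the listener's left) are all assumptions made for this example. The point it shows is that a renderer needs the source's direction relative to the head, not its direction in the room.

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Return the source direction relative to the listener's head.

    Hypothetical helper. Both angles are in degrees, measured
    anti-clockwise from straight ahead (positive = to the left).
    The result is wrapped into the range [-180, 180).
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# A source straight ahead in the room, with the head turned 90 degrees
# to the left, should now be heard on the listener's right.
print(relative_azimuth(0.0, 90.0))  # -90.0
```

A binaural renderer would then choose its left and right ear filters for that relative angle, so the source appears to stay put in the room as the head turns.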
Ambisonics was developed in the early 1970s by mathematician and tape recorder enthusiast Michael Gerzon with the aim of going beyond what he saw as the restrictions of stereo.
Gerzon’s theory was that proper spatial imaging could only be achieved if the acoustical signals in the recording environment were captured. Identifying what he called the ‘soundfield’ as comprising the absolute sound pressure level and the three pressure gradients (left/right, front/back and up/down), Gerzon designed a specialised four-capsule microphone to pick up a true multi-dimensional audio image.
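The four soundfield components described above are conventionally labelled W (pressure), X (front/back), Y (left/right) and Z (up/down), together known as first-order B-format. As a rough illustration only, the Python sketch below encodes a single mono sample arriving from a given direction into those four signals. The function name is hypothetical, the angle convention (azimuth anti-clockwise from straight ahead, elevation upwards) is an assumption, and the 1/√2 weighting on W follows the traditional convention rather than anything stated in this article.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Hypothetical helper for illustration. Azimuth is measured
    anti-clockwise from straight ahead, elevation upwards from the
    horizontal plane, both in degrees.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)  # front/back gradient
    y = sample * math.sin(az) * math.cos(el)  # left/right gradient
    z = sample * math.sin(el)                 # up/down gradient
    return w, x, y, z


# A source directly in front contributes only to W and X.
w, x, y, z = encode_b_format(1.0, 0.0, 0.0)
```

A real SoundField-style microphone derives these signals from its four capsules rather than from a known direction, so this sketch covers only the synthetic panning case.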
The SoundField mic is still used today, manufactured by Australian company RØDE as the NT-SF1. Other Ambisonic microphones aimed at VR work include Sennheiser’s AMBEO, the Nevaton and two options from Zoom (the recorder maker, not the video conferencing company): the VRH-8 capsule for the H8 recorder and the mic array on the H3-VR recorder.
Binaural dummy heads and microphones are manufactured by Brüel & Kjær (part of HBK), Neumann (owned by Sennheiser) and Binaural Enthusiast. Alternative – and cheaper – options without the heads are Sound Professionals’ in-ear MS-TFB-2 and the 3Dio EM-172 binaural microphones.
Dolby says it is not currently looking at audio for VR in the enterprise sectors. A future implementation remains possible, however, because Atmos is an option on both Microsoft HoloLens 2 and Meta Quest 2, along with what both companies describe as ‘spatial audio’.
Getting a headset in enterprise
These headsets are being used in healthcare today but usually without the audio component, although spatial sound was used as far back as 2018 for the Stanford Virtual Heart VR project, which was built around what was then the Oculus headset.
The military is also beginning to deploy spatial sound VR simulations to add more realism to simulated combat situations for training purposes.
To provide this additional dimension, security and defence contractor QinetiQ has established a dedicated team to work on immersive technologies, including VR, AR and mixed reality (grouped together as XR), for a range of military applications, including training and simulation.
David Taylor, capability lead for immersive technologies at QinetiQ, explains that because much of its recent work in XR is based on the Unity and Unreal games engines, audio spatializers are used to represent the sources of sound associated with the images.
Both Unity and Unreal can work with binaural and Ambisonics, either natively or through plug-ins.
“We use the relevant plug-in supplied for the game engine we’re working in at the time,” says Taylor.
“For the Unity platform, when deploying to a Meta Quest device, we would use the Oculus Audio Spatializer. The ability to correctly represent the sound of a vehicle, aircraft or even wildlife in an environment and then reinforce that with the correct audio characteristics contributes towards the overall effect to the user of a virtual environment.”
“Directional sound cues can also provide some real value from a training perspective, such as attempting to provide distraction by other devices or even people/avatars, reflecting a real-world scenario.”
Taylor highlights the value of having ambient sound in an XR space, which enables the user to be surrounded with audio and reinforces the overall experience.
“Rarely is a real-world space completely silent and neither should a virtual space be,” he says.
“Using spatial sound can also better simulate a location – such as the rear of a vehicle or aircraft – even when the user is in a very different physical location, like a training room.”
In many ways, these are still relatively early days in the use of spatial audio for VR and AR in the enterprise, healthcare, and military sectors. But as people already know what it has added to films, TV, and games, it cannot be long before it is more widely used elsewhere.