Improved Spatial Audio from Ambisonic Surround Sound Software
Submitting Institution
University of Derby
Unit of Assessment
General Engineering
Summary Impact Type
Technological
Research Subject Area(s)
Information and Computing Sciences: Artificial Intelligence and Image Processing, Information Systems
Engineering: Electrical and Electronic Engineering
Summary of the impact
    The reduction of spatial variation in the quality of reproduced sound
      within a defined space using varied loudspeaker placements is a
      significant challenge for sound engineers. Dr Bruce Wiggins has conducted
      research into encoding, decoding and processing algorithms using
      Ambisonics, a system based around full-sphere sound reproduction. The
      outcomes of the research have been made accessible to the wider community
      by the creation of a suite of software plug-ins (WigWare), a production
      workflow, and associated teaching materials which can enable commercial
      audio workstations to benefit from Ambisonics. There are numerous recorded
      instances of successful use.
    Underpinning research
    Ambisonics, pioneered in the 1980s by Michael Gerzon, is a kernel-based
      3D surround sound system. The encoding (recording or panning) of the audio
      is separated from the decoding (or rendering) of the audio to speaker
      feeds. This means that the system can be rendered to any number of
      speakers in almost any position in 3D space, as long as the positions of
      the speakers are known. Moreover, Ambisonics is a system optimised around
      a number of psycho-acoustic criteria which, when implemented, reduce the
      variability of audio no matter what speaker arrangement is used for
      reproduction. This allows for a 'mix once' system where subsequent
      remixing is not necessary when replayed over different loudspeaker
      systems. The research work carried out began by giving a thorough treatise
      on Ambisonic theory, specifying the relationship and conversions from and
      to other surround systems, which led to work quantifying and optimising
      the system using measured Head Related Transfer Function (HRTF) data and
      the more standard energy and velocity vector analysis. These complex
      multiple-input, multiple-output systems were then optimised using a
      modified heuristic Tabu search algorithm with multiple defined fitness
      functions (Wiggins 2003, 2004). This was necessary as, up to this point
      (2003), there was no known or published method of generating Ambisonic
      decoder coefficients for irregular speaker arrays. This included the
      standard 5 speaker array as specified by the International
      Telecommunication Union (ITU-R BS 775); the standard matrix pseudo-inverse
      approach gave results that were psycho-acoustically sub-optimal as the
      matrices in question became ill-conditioned. Work carried out by Gerzon
      just before his death in 1995 on this subject was shown to lead to
      sub-optimal results (Wiggins 2003). This research solved that problem.
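
The energy and velocity vector analysis mentioned above can be sketched as follows. This is a minimal illustration only, assuming a horizontal-only, first-order system and a simple projection decoder in which each speaker samples the encoded field in its own direction; it is not the Tabu-search-optimised decoder from the research. The function names are hypothetical, and the surround speakers of the ITU layout are assumed to sit at ±110°.

```python
import math

def decoder_gains(speaker_deg, source_deg):
    """Speaker gains for a first-order, horizontal-only projection decoder.

    Assumption: each speaker simply samples the encoded field in its own
    direction, which is only well behaved for regular (evenly spaced) arrays.
    """
    n = len(speaker_deg)
    theta = math.radians(source_deg)
    return [(1.0 + 2.0 * math.cos(theta - math.radians(phi))) / n
            for phi in speaker_deg]

def gerzon_vectors(speaker_deg, gains):
    """Return (|rV|, angle of rV, |rE|, angle of rE), with angles in degrees."""
    def weighted_direction(weights):
        total = sum(weights)
        x = sum(w * math.cos(math.radians(p))
                for w, p in zip(weights, speaker_deg)) / total
        y = sum(w * math.sin(math.radians(p))
                for w, p in zip(weights, speaker_deg)) / total
        return math.hypot(x, y), math.degrees(math.atan2(y, x))

    rv_len, rv_ang = weighted_direction(gains)                   # velocity: weights g_i
    re_len, re_ang = weighted_direction([g * g for g in gains])  # energy: weights g_i^2
    return rv_len, rv_ang, re_len, re_ang

octagon = [i * 45.0 for i in range(8)]       # regular 8-speaker ring
itu_5 = [0.0, 30.0, -30.0, 110.0, -110.0]    # assumed ITU-R BS.775 style angles

regular = gerzon_vectors(octagon, decoder_gains(octagon, 30.0))
irregular = gerzon_vectors(itu_5, decoder_gains(itu_5, 30.0))
```

For the regular octagon, both vectors point at the 30° source, with a velocity vector length of 1 and an energy vector length of 2/3. For the irregular five-speaker layout, the velocity vector of this naive decoder no longer points at the source, which illustrates why irregular arrays call for specially optimised decoder coefficients.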
      Subsequent work concentrated on the encoding side of Ambisonics, giving
      new insight into the performance of Ambisonic 3D microphones with respect
      to near-field effects (distance cues), and how these could be handled in
      terms of decoding to loudspeaker arrays (Wiggins, 2009). Concurrently,
      work based on both the use of Ambisonic mixing tools in the context of
      music production and digital audio workstations allowed for the wider
      community to utilise the outputs of this research in the form of software
      Virtual Studio Technology (VST) plug-ins which implemented encoding,
      decoding and 3D reverb effects for up to 24 loudspeakers for 1st, 2nd, 3rd
      and 4th order Ambisonic systems. Note here that the order is an indication
      of the number of channels used in the encoding [(N+1)^2 where N is the
      order], with higher channel counts resulting in better spatial acuity.
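
As a small worked example of the channel-count formula above, assuming full-sphere (with-height) encoding, the orders supported by the plug-ins map to channel counts as follows. The function name is illustrative and not part of the WigWare software.

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere Ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2

# Orders 1 to 4, as supported by the plug-ins described in the text.
channel_counts = {n: ambisonic_channels(n) for n in range(1, 5)}
# 1st order -> 4, 2nd -> 9, 3rd -> 16, 4th -> 25 channels
```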
      This software, named WigWare, first released in November 2005, has evolved
      to take in further work based on proximity effect, near-field compensation
      and speaker compensation. This arises from the distance of both the
      encoded, or recorded source, and the speakers used to replay the audio. To
      disseminate the work, 'screencasts' documenting how to set up and use the
      software research outputs in an example Digital Audio Workstation (DAW)
      have been made publicly available (using Reaper as the DAW -
      http://reaper.fm). This work has led to a number of invited presentations
      at DAFx 2012, the Institute of Acoustics Reproduced Sound 2012 and
      Birmingham University, and to involvement with the BBC R&D Audio Research
      Partnership.
    The principal investigator was Bruce Wiggins, who enrolled as a PhD
      student at the University of Derby in 1999, completing in 2004. The
      supervisors were Iain Paterson-Stephens and Professor Richard Thorn with
      contributing work by Dr Stuart Berry and Dr Valerie Lowndes (all staff at
      the University of Derby).
    References to the research
    
The key outputs related to the work are:
      Wiggins, B., Paterson-Stephens, I., Lowndes, V., Berry, S. (2003) The
      Design and Optimisation of Surround Sound Decoders Using Heuristic
      Methods. Proceedings of UKSim 2003, Conference of the UK Simulation
      Society p.106-114.
      http://tinyurl.com/BruceImpact5
      (Best Quality 1 of 3)
     
Wiggins, B. (2004), An Investigation into the Real-time Manipulation and
      Control of Three-dimensional Sound Fields, PhD thesis, University of
      Derby, Derby, UK.
      
        http://tinyurl.com/BruceImpact4 (Best Quality 2 of 3)
     
Wiggins, B. (2007) The Generation of Panning Laws for Irregular Speaker
      Arrays Using Heuristic Methods. Proceedings of the 31st International AES
      conference, London, UK.
      http://tinyurl.com/BruceImpact3
      (Best Quality 3 of 3)
     
Wiggins, B. (2008) Has Ambisonics Come of Age? Reproduced Sound 24 -
      Proceedings of the Institute of Acoustics, Vol 30. Pt 6.
      
        http://tinyurl.com/BruceImpact2
     
Wiggins, B., Spenceley, T. (2009) Distance coding and performance of the
      mark 5 and st350 SoundField microphones and their suitability for
      Ambisonic reproduction. Reproduced Sound 25 - Proceedings of the Institute
      of Acoustics, Vol 31, Pt 4.
      
        http://tinyurl.com/BruceImpact1
     
Details of the impact
    After completion of the PhD, it became apparent that although there were a
      number of researchers working on Ambisonics, it was extremely difficult to
      utilise the published work for audio production and live events.
      Ambisonics was not implemented in any commercial music production
      software, meaning practitioners could not utilise the benefits of the
      system. To this end, a number of VST software plug-ins were created
      (November 2005) which implemented the encoding and decoding algorithms
      necessary for 1st and 2nd order Ambisonics. These plug-ins could be loaded
      into VST compliant DAWs allowing existing workflows to be leveraged with
      minimal changes. The plug-ins were improved and augmented, including
      graphical user interfaces, 3D reverberation, distance filtering/proximity
      effects and up to 4th order operation, to incorporate new research from
      the UoA. They have been consistently updated and expanded with bespoke
      decoders created for new projects. Both the software and a publicly
      available set of instructional videos are available at
        www.BruceWiggins.co.uk. The software has been used in a number of
      projects in both academia (from Undergraduate to PhD level work) and in
      industry with the public being exposed to the outputs of the system
      internationally.
    The reputation of the University's work in spatial audio technology led
      to an approach by Funktion One, designers of high quality sound
      reinforcement equipment, who were looking to implement spatial audio in
      large, live outdoor events. The WigWare software was used to process live
      audio, with extra software created to allow for frequency dependent
      panning effects. This joint project led to a measured benefit of lower
      noise levels off-site, and more enveloping audio on-site compared to other
      stages controlled in a standard stereo-based manner. The software is now a
      feature of Funktion One live events, known as the Funktion One
      Experimental Soundfield, with the system featured at the Glade Festival
      every year since 2006, and at the Glastonbury Festival on the Glade and
      Spirit of 71 stages from 2008. The latter included a live version of
      Tubular Bells processed by the WigWare software in 2011. In an article
      interviewing Anselm Guise, one of the founding members of the Glade
      Festival (TPI, 2012), he openly praises the system: "My highlights this
        year were the Meteor and I would say Origin this year was incredible,"
        commented Guise. "The decor from Artescape in Cape Town combined with
        the lighting and of course the amazing Ambisonic sound from Funktion-One
        really was another level." John Newsham (from Funktion One) is
      quoted in the same article: "The Ambisonic system runs in Audio Mulch on
        a laptop with Bruce Wiggins' Wigware Ambisonic VST plug-ins and an Echo
        soundcard. Bruce has developed second order Ambisonic VST panners which
        are capable of delivering some stunning pan effects" (TPI, 2012).
    The research is also utilised by the computer games company Codemasters, who
      have pioneered the use of real-time Ambisonic encoding and decoding in
      their PlayStation 3, Xbox 360 and PC games. They have been using the
      WigWare suite of Ambisonic plug-ins in order to create pre-mixed Ambisonic
      audio for use in their games, including Colin McRae: DiRT 2 (reported by
      VGChartz as 1.68M sales) and the 2011 BAFTA-winning Formula 1 2010 game
      (2.3M sales reported by trade paper MCV on 17th May 2011). The use of
      pre-mixed audio has allowed for full 3D scenes to be constructed, mixed,
      and stored using just four channels of audio, yet still allows them the
      flexibility of re-orienting the entire sound field using just a few
      multiply and add operations per sample. Different speaker layouts in the
      decoding are then easily added post mix. This has allowed an increase in
      the complexity and amount of audio content in the game as the storage of
      the 3D scene is much more efficient. It is thought that this steerable
      pre-mixed audio, i.e. pre-mixed audio that can still take into account
      which way the camera is facing on decoding, was a world first in computer
      games, and it allowed significantly more audio content to be included on
      the DVD used to distribute Xbox games.
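
The re-orientation of a four-channel pre-mix described above can be illustrated with a short sketch. It assumes first-order B-format with Furse-Malham style weighting (W scaled by 1/sqrt(2)) and a rotation about the vertical axis only; the function names are hypothetical and this is not Codemasters' or WigWare's actual code.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Assumption: Furse-Malham style weighting, with W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (sample / math.sqrt(2.0),
            sample * math.cos(az) * math.cos(el),
            sample * math.sin(az) * math.cos(el),
            sample * math.sin(el))

def rotate_yaw(bformat, yaw_deg):
    """Rotate the whole sound field about the vertical axis.

    Costs four multiplies and two adds per sample; W and Z are unchanged.
    """
    w, x, y, z = bformat
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return (w, x * c - y * s, x * s + y * c, z)

# A source pre-mixed at 30 degrees azimuth, then steered by a further 60 degrees.
premixed = encode_first_order(1.0, azimuth_deg=30.0, elevation_deg=10.0)
steered = rotate_yaw(premixed, 60.0)
expected = encode_first_order(1.0, azimuth_deg=90.0, elevation_deg=10.0)
```

Rotating the stored pre-mix gives the same four channels as encoding the source directly at the new angle, so the scene can follow the camera without access to the individual sources. In a real-time setting the sine and cosine would typically be computed once per audio block rather than once per sample.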
    The BBC's Audio R&D department have been using the research as part
      of a wider project looking into future high definition audio formats.
      Research Engineers from BBC's audio team visited Derby in 2011 in order to
      discuss Ambisonics implementation issues and production workflow. They
      also attended our Sounds in Space event the same year where Wiggins
      presented a live demonstration comparing 1st, 2nd and 3rd order Ambisonics
      using the WigWare software. This, along with the research, has "helped
        members of our team to understand Ambisonics processing and how it
        sounds, allowing us to explore the area more effectively in our own
        research." The team has also used WigWare plug-ins, along with a
      number of web-based animations that Wiggins generated to visualise the
      differences between Ambisonic orders, to demonstrate Ambisonics as a
      surround format to colleagues at the BBC; this has informed decisions on
      the direction of research at the company. Also, the BBC have "used your
        [Wiggins'] papers ([2004] thesis and 31st AES conference) in developing
        an understanding of issues for irregular loudspeaker layout decoding
        processes, and used it to build some of our own algorithms in MATLAB for
        3D irregular decoding". The BBC's work in this area has been
      reported on TV and the web, with a screenshot of the WigWare web site
      appearing on BBC Click (18m.50s on 13-08-2011 episode, BBC, 2011).
    Further afield, the Museum of Jurassic Technology in Los Angeles
      (http://mjt.org/) has a small 14 seat cinema theatre where they are
      working on a 3D, stereoscopic, motion picture that was shot in the
      Republic of Georgia (2011). The audio was recorded with a B-format
      Ambisonic microphone and an accompanying Ambisonic Music Track has been
      produced using the WigWare plug-ins. The founder of the Museum, who is the
      user of the research, was awarded the MacArthur Fellowship (nicknamed the
      Genius Grant) in 2001.
    Examples of the impact of the work on live and recorded music production
      are numerous. Blumlein Records in Germany not only releases
      recordings made using the WigWare tools, but also advocates their use in
      magazine articles and on the internet regarding surround audio production:
      "Mixing took place in Ambisonic B-Format. I used Bruce Wiggins'
        Ambisonic panner to place every M/S decoded channel at its position
        within an octagon and WigWare's 5.1 decoder to map the B-Format to my
        ITU 5.1 monitoring environment. When editing the session alongside the
        composer, Oliver [Korte] was amazed by the precise localisation of every
        sound source around us. Quite an achievement considering the - in
        Ambisonic terms - irregular placement of the loudspeakers." (Levine,
      2011). A surround DVD released in February 2013 entitled Oliver Korte:
      Elemente is a typical example of work produced using the software
      (AllMusic, 2013), with reviews available, in German, at Korte-oliver.de
      (2013). One such review states "...it is included on DVD in a
        breath-taking 5.1-surround version...". On Tuesday 3rd July 2012,
      Bob Belden's Animation band performed tracks from the Transparent Heart
      album in live 8-speaker Ambisonics, using the WigWare software to
      provide Ambisonic reverberation in real time. A review of the
      evening stated "Coupled to Ambisonic's pioneering live surround sound
        set up that was manipulated live by its creator Serafino DiRosario...
        the immersive audiovisuals only served to heighten the intense feelings
        at the heart of Belden's visionary urban jazz aesthetic." (Flynn,
      2012).
    A number of researchers worldwide are utilising the research, such as at
      Queensland University of Technology in Australia, where the WigWare
      software is used to provide live panning and decoding in the production of
      a theatrical work with 3D audio (Wilkinson, 2013). A recent PhD student at
      Queen Mary University of London used the software in a project with the BBC
      (Morrell et al., 2012) in order to create a hybrid 3D audio rendering
      system combined with surround vision. In addition, a composer, orchestra
      conductor and researcher at the University for the Creative Arts is
      utilising the software in their PhD in music (Abras, 2013). Wiggins also
      generated the irregular Ambisonic decoder coefficients used in the CSound
      Ambisonic Decoder op-code (Furse et al., 2008). CSound, a computer programming
      language for sound, has been downloaded more than 310,000 times since
      January 2008 (source: Sourceforge.net).
    Sources to corroborate the impact 
    Abras, J. 2013. www.juanmanuelabras.com. [online] Available at:
      http://www.juanmanuelabras.com/ [Accessed: 18 Oct 2013].
    
AllMusic. 2013. Oliver Korte: Elemente - Various Artists | Release
        Credits | AllMusic. [online] Available at: http://www.allmusic.com/album/release/oliver-korte-elemente-mr0003887094/credits
      [Accessed: 18 Oct 2013]
    BBC, 2011. BBC Click Episode 13-08-2011 [online] Available at:
    http://www.bbc.co.uk/programmes/b013pdnf, still image found here http://twitpic.com/65ana6
      [Accessed: 18 Oct 2013]
    Flynn, M. 2012. Jazz breaking news: Bob Belden's Animation Dive Into
        The Dark Side Of Manhattan. [online] Available at: http://www.jazzwisemagazine.com/news-mainmenu-139/69-2012/12416-jazz-breaking-news-bob-beldens-animation-dive-into-the-dark-side-of-manhattan
      [Accessed: 18 Oct 2013].
    Furse, R., Wiggins, B., Adriaensen, F. and Groner, S. 2008. bformdec1.
      [online] Available at:
      http://www.csounds.com/manual/html/bformdec1.html
      [Accessed: 18 Oct 2013].
    Korte-oliver.de. 2013. [korte-oliver.de] Prof. Dr. Oliver Korte:
        Komponist und Musiktheoretiker: Presse zur CD/DVD "Elemente".
      [online] Available at: http://www.korte-oliver.de/werke/kritiken/kritiken-view/presse-zur-cddvd-elemente/
      [Accessed: 18 Oct 2013].
    Levine, A. (2011) Rediscovering Ambisonics, Resolution (Audio for
      Broadcast, Post, Recording and Multimedia Production), pp47-48 V10.3 April
      2011
    Morrell, M., Baume, C., Reiss, J. Vambu Sound: A Mixed Technique 4-D
      Reproduction System with a Heightened Frontal Localisation Area. Spatial
      Audio in Today's 3D World - AES 25th UK Conference, University of York,
      2012
    TPI. 2012. Electronic Wonderland. [online] Available at: http://www.tpimagazine.com/production-profiles/1589162/electronic_wonderland.html
      [Accessed: 18 Oct 2013]
    Wilkinson, J. 2013. Notes on Creating a Sound Play. [online]
      Available at: http://ambisonicsinthetopend1945.wordpress.com/
      [Accessed: 18 Oct 2013].