Multimedia: the impact of content-based multimedia algorithms

Submitting Institution

Goldsmiths' College

Unit of Assessment

Computer Science and Informatics

Summary Impact Type

Societal

Research Subject Area(s)

Information and Computing Sciences: Artificial Intelligence and Image Processing, Information Systems


Summary of the impact

This study describes some of the impact that has come out of our long-running research on content-based multimedia algorithms. We describe three particular routes to impact: adding value to large archives of multimedia, giving a voice to disabled musicians, and commercialising the research through games and media companies.

Underpinning research

Michael Casey joined Goldsmiths as a Senior Lecturer in 2004 and has been employed here ever since; he was promoted to Professor, and has been 0.2 FTE since 2008. He joined Goldsmiths having co-edited the MPEG-7 audio tools, and started developing new algorithms for content-based multimedia information retrieval. These radically improved efficiency over the then-standard algorithms, reducing polynomial-time search to sub-linear time. He used them in developing libraries for music and multimedia content-based matching [SoundSpotter, AudioDB, Bregman, ACTION].

Mick Grierson is a Senior Lecturer: he was appointed as a 0.5 FTE Lecturer in 2008 and was made full-time in 2010. He and Casey started working together in 2008, creating real-time interactive audiovisual software systems for sound, music and image analysis [1]. Grierson later used this work as a basis for his own libraries for real-time audiovisual interaction. Here we describe two strands of research, both stemming from this work on multimedia algorithms, that led to very different kinds of impact: content-based search and participatory design with disabled communities.

Content-based Search: Content-based search methods, such as locality-sensitive hashing (LSH), have many free parameters and are difficult to optimise. Casey, sponsored by Yahoo! Research, developed a new Minimum Distances approach that adapts the nearest-neighbour strategy, limiting the search space with an optimal radius bound determined by sampling the dataset. The method efficiently solves a range of common retrieval tasks by providing optimal parameters for low-distortion embedding schemes such as LSH [2]. This leads to a straightforward implementation of multimedia search with retrieval times several orders of magnitude faster than those using exhaustive distance computations or non-optimal LSH parameters [3]. Casey's method is compatible with most existing multimedia features and is used for big-data content-based search optimisation in industrial-scale applications.
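To illustrate the kind of scheme involved, the following is a minimal sketch of p-stable LSH for Euclidean nearest-neighbour search, with the bucket width chosen by sampling nearest-neighbour distances from the data, in the spirit of the Minimum Distances idea. All function names and parameter choices here are illustrative assumptions, not Casey's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def estimate_width(data, n_sample=50):
    # Sample the dataset and take the median nearest-neighbour distance,
    # echoing the idea of bounding the search radius by sampling
    sample = data[rng.choice(len(data), size=min(n_sample, len(data)), replace=False)]
    d = np.linalg.norm(sample[:, None] - sample[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return float(np.median(d.min(axis=1)))

def make_tables(dim, n_tables=4, n_bits=8):
    # Each table: Gaussian projections plus random offsets (p-stable LSH for L2)
    return [(rng.normal(size=(n_bits, dim)), rng.uniform(size=n_bits))
            for _ in range(n_tables)]

def hash_key(x, table, width):
    A, b = table
    return tuple(np.floor(A @ x / width + b).astype(int))

def build_index(data, tables, width):
    index = [{} for _ in tables]
    for i, x in enumerate(data):
        for t, tab in enumerate(tables):
            index[t].setdefault(hash_key(x, tab, width), []).append(i)
    return index

def query(q, data, tables, index, width):
    # Only points that share a bucket with q are checked exactly
    cand = {i for t, tab in enumerate(tables)
            for i in index[t].get(hash_key(q, tab, width), [])}
    if not cand:
        return None
    cand = sorted(cand)
    return cand[int(np.argmin(np.linalg.norm(data[cand] - q, axis=1)))]
```

The number of tables, the number of bits per hash and the bucket width are exactly the free parameters that an analysis of minimum distances is designed to set well rather than by trial and error.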

Participatory Research with Disabled Children: Grierson extended the work described in [1] in collaboration with Whitefield School and Centre, the largest Special Educational Needs (SEN) school in Europe. He worked with children with multiple disabilities (including deafness, motor difficulties and autism) to develop a system called LumiSonic that enabled deaf children to interact, by way of wireless mobile sensors, with sound and music using visualisations of audio features [4]. To support quick cycles of software prototyping, testing and improvement, Grierson developed Maximilian, a library of C++ rapid-prototyping tools for real-time signal analysis, visualisation and digital signal processing [5]. This allowed software to be developed quickly while working with disabled children. Children were given prototypes and encouraged to engage in specific tasks, evolving from discrete interactions (such as recording and playing back a sound) to more complex interactions involving continuous fine motor control (e.g. for sound manipulation and adjustment).

Through a series of iterations with users, Grierson designed systems and devices that were usable by the entire disabled user group. For example, while touchscreen-based systems were fine for some, they proved frustrating for users with severely restricted mobility or cognitive disabilities. For these children, iOS device sensors interpreted movement data directly from participants' existing wheelchair interfaces. For children with cognitive disabilities, he created a tactile human-computer interface called NoiseBear: a bespoke, wireless, low-voltage pressure-sensitive mass that uses machine learning techniques to adapt to the individual user's needs [6]. As far as the children are concerned it is a cuddly toy which they can move and manipulate to control the sound and music.
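The case study does not specify which machine learning technique NoiseBear uses, but the adaptation idea can be sketched with a simple nearest-centroid classifier: a short calibration session records a few example sensor readings per gesture for one child, and later readings are mapped to the nearest learned gesture. All names and data here are hypothetical.

```python
import numpy as np

def train_gestures(examples):
    # examples: {gesture_name: list of pressure-sensor reading vectors}
    # One centroid per gesture, learned from a handful of demonstrations
    return {name: np.mean(vecs, axis=0) for name, vecs in examples.items()}

def recognise(centroids, reading):
    # Pick the gesture whose learned centroid is closest to the current reading
    return min(centroids, key=lambda g: np.linalg.norm(centroids[g] - reading))
```

Because the centroids are learned per child, the same soft controller can respond to very different grips and pressures without reprogramming.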

References to the research

The international quality of the research is evidenced by the publication of results in highly regarded, rigorous journals ([3], for example, is published in the journal ranked 2nd in its field) and by the quality of the software ([5] is an output in this REF submission).

1. Casey, M. & Grierson, M. "Soundspotter / Remix-TV: Fast Approximate Matching for Audio and Video Performance", in Proceedings of the International Computer Music Conference, Copenhagen, Denmark, 2007. [Available online or from the research office]

2. Slaney, M. & Casey, M. (2008). Locality-sensitive hashing for finding nearest neighbors, IEEE Signal Processing Magazine, Vol. 25, No. 2, pp. 128-131. DOI: 10.1109/MSP.2007.914237


3. Casey, M., Rhodes, C. & Slaney, M. (2008). Analysis of Minimum Distances in High-Dimensional Musical Spaces, IEEE Transactions on Audio, Speech and Language Processing, Vol. 16, No. 5, pp. 1015-1028. DOI: 10.1109/TASL.2008.925883


4. Grierson, M. (2011). `Making Music with Images: Interactive Audiovisual Performance Systems for the Deaf', International Journal on Disability and Human Development, Vol. 10, No. 1, pp. 37-41. DOI: 10.1515/ijdhd.2011.009.


5. First described in Grierson, M. & Kiefer, C. (2011). `Maximilian: An easy to use, cross platform C++ toolkit for interactive audio and synthesis applications', in Proceedings of the International Computer Music Conference, Huddersfield, UK. [REF output; details in REF 2b]

6. Grierson, M. & Kiefer, C. (2013). `NoiseBear: A Wireless Malleable Multiparametric Controller for use in Assistive Technology Contexts', in CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI EA '13), pp. 2923-2926. ACM, New York, NY. DOI: 10.1145/2468356.2479575.


Funded Research Projects that relate to the underpinning research:

AHRC grant AH/D000602/1, Cognitive and Structural Approaches to Contemporary Audiovisual Computer Aided Composition, 2006-2010; AHRC grant AH/H038264/1, Sound, Image and the Brain: Cognitive Live-Arts Technology in Contemporary Game-Oriented and Accessibility Paradigms, 2010-2013; OMRAS2: A Distributed Research Environment for Music Informatics and Computational Musicology, EP/E02274X/1, 2007-2010; Yahoo! Faculty Research Award, "Scaling Content-Based Search", 2007-2008; Google Faculty Research Award for content-based multimedia algorithms, "Search By Groove", 2011.

Details of the impact

Adding Value to Multimedia Collections:

We have worked with several holders of collections of multimedia content, including the BBC, to use our algorithms to make more powerful and flexible content-based search engines and interfaces. Some of this work has been done with, and for, Yahoo! and Google. As Slaney, now a Principal Research Scientist at Microsoft, writes of his time at Yahoo!:

"We set out to find technology that would leverage Yahoo's success at building web-scale search engines, and apply the ideas to music. Michael Casey and his team were instrumental to the work.... The optimal algorithm work is highly novel, game changing (I believe) and helps make LSH [Local Sensitivity Hashing] practical. LSH is now the basis of many applications, including the television discovery and identification algorithms fielded by Yahoo, and (I understand) the copyright detection algorithm used by Google." (see [1])

The Yahoo! implementation, which acknowledges Casey's work, is on GitHub, accompanying a research paper by Yahoo! that extends the Minimum Distances algorithm [Optimal LSH, by Yahoo! researchers Slaney, Lifshits and He, in Proceedings of the IEEE (2012)] (see [2]).

Creating tools for children with sensory impairments:

The participatory design work continues to be developed in collaboration with Whitefield School. The LumiSonic system is distributed by the arts organisation Soundandmusic.org and featured in a BBC online news article [3]. The children in the Whitefield School continue to contribute to the design of Noisebear and other systems for making music in installations and workshops. This work is in collaboration with Whitefield teacher Nicole Whitelaw and continues to this day [4].

Giving a voice to disabled musicians:

Grierson carried this work forward through an international collaboration with disabled artists, "The Dean Rodney Singers". Dean Rodney, a 21-year-old artist, rapper and musician who is also autistic, travelled with representatives from the production company Heart n Soul to seven countries to write and record an album with musicians, many of whom had disabilities. Grierson was technical lead for the project. The hardware and software drew directly on the Maximilian library and the experience and designs developed in the research trajectory described above. Sound was visualised through the extraction of low-level features. Interactive music-making devices were informed by adaptations of wheelchair controls developed in the research described above, and made more accessible through the use of sensors.

This collaboration formed the basis of an audiovisual installation at London's Royal Festival Hall from August 31st to September 9th 2012, as part of the Paralympics Unlimited festival of arts, culture and sport by deaf and disabled people. It was one of 29 commissions awarded throughout the UK, and the only digital artwork. The methodology of working with the singers to make the installation drew on the experience of Grierson's participatory design research. At the installation, members of the public created their own music tracks using the interactive technology and the sounds created by the Dean Rodney Singers and uploaded them to the Dean Rodney Singers YouTube channel. The installation featured six different interactive music stations for singing, DJ'ing, dancing and other musical activities. Over 2,000 videos were recorded in the week-long exhibition — approximately 300 per day. The Dean Rodney Singers event received extensive coverage in print and other media, including The Guardian [6], Wired [7], Metro [8], TNT [9], News Shopper [10], BBC London [11], and Lauren Laverne's `Spacepod' podcast [12].

The installation was complemented by the `Dean Rodney Singers App' for iPhone, iPod Touch and iPad, based on Grierson's `Sonic Tag' on the Apple App Store. A further grant of £15,000 from Creativeworks resulted in Dean Rodney releasing a second app to promote his band's single, `Fish Water'. The app was again built with Grierson's technology and released in October 2013.

The impact of this work is not only on the performers themselves, who found a very public and collective artistic voice, but also on the public perception of people with disabilities, and of people with severe disabilities in particular: changing that perception was an important part of the spirit of the London Olympic and Paralympic Games.

Games and Media Companies:

Grierson — in partnership with soundandmusic.org and a games company, Roll7 — received knowledge-transfer funding from the AHRC to create a closed-source music visualisation library. The library contains code that extracts low-level features from music and sound effects and uses them to inform gameplay. This led to the development of three prototypes and one commercial game, `OlliOlli'. The game was commissioned by Sony Computer Entertainment, who specifically requested that it feature the audio analysis. It is due to be released on the portable PlayStation Vita in 2014 [5]. The collaboration with Roll7 also led to further funding of about £40K, from the Abertay Proof of Concept programme, for the development of sound-based games: Grierson's library analyses music in real time, providing information that links the music to the game by changing the game's timing and pace with the music [5].
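The coupling of low-level audio features to gameplay can be sketched as follows. The chosen feature (short-time RMS energy) and the mapping from loudness to a speed multiplier are illustrative assumptions for this sketch, not the contents of the actual closed-source library.

```python
import numpy as np

def frame_rms(signal, frame_len=1024, hop=512):
    # Short-time RMS energy: one low-level feature a game loop could poll
    n = (len(signal) - frame_len) // hop + 1
    return np.array([np.sqrt(np.mean(signal[i * hop:i * hop + frame_len] ** 2))
                     for i in range(n)])

def speed_multiplier(rms, base=1.0, scale=2.0):
    # Map normalised loudness to game pace: louder music, faster game
    return base + scale * rms / max(rms.max(), 1e-9)
```

A game loop would evaluate the current frame's multiplier each tick and scale scrolling speed or obstacle timing accordingly, so the action stays in step with the music.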

In 2011 the co-founder of the multi-million-pound company Searchspace, and CEO of Bodymetrics, visited Goldsmiths and met with Grierson for the first time. After a series of recent discussions they have formed a legal partnership (though outside the impact reporting period), including the ex-head of Universal Music Worldwide and a founding member of the Band Aid Trust, aimed specifically at exploiting the expertise developed through Grierson's research on music software and sensor technology [13].

Sources to corroborate the impact

All the material listed below is available on request from Goldsmiths' Research Office.

  1. Hard copies of correspondence available on request.
  2. Yahoo: Efficient implementation of locality-sensitive hashing (LSH) on GitHub
  3. BBC News Website Helping the deaf to `see sound' Lumisonic
  4. Teacher, Whitefield School and Centre
  5. Director, Roll7 Blog
  6. Arts head: Mark Williams, artistic director, Heart n Soul, 21 Aug 2012: 28 Aug 2012: London 2012 and disability arts: `we'll be famous for 15 minutes'
  7. Wired, 26 July 2012
  8. Metro, 1 Sept 2012
  9. TNT Magazine, 27 Aug 2012
  10. News Shopper, 6 Aug 2012
  11. BBC London, 21 Aug 2012
  12. Lauren Laverne, 4 Sept 2012
  13. Contact Director, Bodymetrics [contact details provided separately]