Multimedia: the impact of Content-based Multimedia algorithms
Submitting Institution: Goldsmiths' College
Unit of Assessment: Computer Science and Informatics
Summary Impact Type: Societal
Research Subject Area(s): Information and Computing Sciences: Artificial Intelligence and Image Processing; Information Systems
Summary of the impact
This study describes some of the impact arising from our long-running
research on content-based multimedia algorithms. We describe three
particular routes to impact: adding value to large multimedia archives;
giving a voice to disabled musicians; and commercialising the research
through games and media companies.
Underpinning research
Michael Casey joined Goldsmiths as a Senior Lecturer in 2004 and has been
employed here ever since; he was promoted to Professor and has been 0.2
FTE since 2008. He joined Goldsmiths having co-edited the MPEG-7 audio
tools, and went on to develop new algorithms in content-based multimedia
information retrieval. These radically improved efficiency over the
then-standard algorithms, reducing search from polynomial to sub-linear
time. He used them to develop libraries for music and multimedia
content-based matching [SoundSpotter, AudioDB, Bregman, ACTION].
Mick Grierson is a Senior Lecturer: he was appointed as a 0.5 FTE
Lecturer in 2008 and was made full-time in 2010. He and Casey started
working together in 2008, creating real-time interactive audiovisual
software systems for sound, music and image analysis [1]. Grierson later
used this work as the basis for his own libraries for real-time
audiovisual interaction. Below we describe two strands of research, both
stemming from this work on multimedia algorithms, that led to very
different kinds of impact: content-based search, and participatory
design with disabled communities.
Content-based Search: Content-based search methods, such as
locality-sensitive hashing (LSH), have many free parameters and are
difficult to optimise. Casey, sponsored by Yahoo! Research, developed
algorithms based on a new Minimum Distances approach, which adapts the
nearest-neighbour strategy by limiting the search space to an optimal
radius bound determined by sampling the dataset. The method efficiently
solves a range of common retrieval tasks by providing optimal
parameters for low-distortion embedding schemes such as LSH [2]. This
yields a straightforward implementation of multimedia search with
retrieval times several orders of magnitude faster than those using
exhaustive distance computations or non-optimal LSH parameters [3].
Casey's method is compatible with most existing multimedia features and
is used to optimise big-data content-based search in industrial-scale
applications.
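The flavour of radius-bounded nearest-neighbour search with LSH can be sketched as follows. This is a generic E2LSH-style illustration, not Casey's Minimum Distances algorithm; all names (`DIM`, `W`, `make_table`, `query`) and the fixed radius `r`, which stands in for the sampled optimal radius bound described above, are hypothetical.

```python
# A minimal sketch of radius-bounded nearest-neighbour search with
# locality-sensitive hashing (E2LSH-style random projections).
# Illustrative only: the radius `r` stands in for the optimal bound
# that the Minimum Distances method would estimate by sampling.
import math
import random

random.seed(0)

DIM, W, N_TABLES, N_PROJ = 8, 4.0, 6, 3

def make_table():
    # each table concatenates several random projections floor((a.v + b) / W)
    return [([random.gauss(0, 1) for _ in range(DIM)],
             random.uniform(0, W)) for _ in range(N_PROJ)]

tables = [make_table() for _ in range(N_TABLES)]

def key(table, v):
    return tuple(math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / W)
                 for a, b in table)

def index(points):
    # hash every point into one bucket per table
    buckets = [dict() for _ in tables]
    for i, p in enumerate(points):
        for t, table in enumerate(tables):
            buckets[t].setdefault(key(table, p), []).append(i)
    return buckets

def query(buckets, points, q, r):
    # gather candidates from matching buckets, keep only those within
    # radius r, and return the closest (None if nothing qualifies)
    cands = set()
    for t, table in enumerate(tables):
        cands.update(buckets[t].get(key(table, q), []))
    hits = [i for i in cands if math.dist(points[i], q) <= r]
    return min(hits, key=lambda i: math.dist(points[i], q), default=None)

pts = [[0.0] * DIM, [10.0] * DIM, [5.0] * DIM]
b = index(pts)
nearest = query(b, pts, [0.0] * DIM, r=1.0)  # -> 0
```

Only candidates that fall in a matching bucket are ever compared, which is what makes the search sub-linear; the radius filter then discards any distant collisions.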
Participatory Research with Disabled Children: Grierson
extended the work described in [1] in collaboration with Whitefield
School and Centre, the largest Special Educational Needs (SEN)
school in Europe. He worked with children with multiple disabilities
(including deafness, motor difficulties and autism) to develop a system
called LumiSonic that enabled deaf children to interact with sound and
music, by way of wireless mobile sensors, using visualisations of
audio features [4]. To support quick cycles of software prototyping,
testing and improvement, Grierson developed Maximilian, a library of
C++ rapid-prototyping tools for real-time signal analysis,
visualisation and digital signal processing [5]. This allowed software
to be developed quickly while working with disabled children. Children
were given prototypes and encouraged to engage in specific tasks,
evolving from discrete interactions (such as recording and playing back
a sound) to more complex interactions involving continuous fine motor
control (e.g. for sound manipulation and adjustment).
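The kind of audio-feature visualisation described here can be sketched very simply. This is not the LumiSonic or Maximilian code, just an illustration of the general idea: cut a signal into frames, extract a low-level feature (here, RMS loudness) per frame, and map each value to a visual parameter such as the radius of an on-screen circle. All function names are hypothetical.

```python
# A minimal sketch (not the actual LumiSonic/Maximilian code) of
# feature-driven visualisation: per-frame RMS loudness mapped to a
# circle radius, so louder sound produces a larger shape on screen.
import math

def rms_frames(samples, frame_size=256):
    """Per-frame root-mean-square loudness of an audio signal."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    return [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames]

def to_radius(rms, min_px=2.0, max_px=80.0):
    # map loudness (roughly 0..1 for full-scale audio) to pixels
    return min_px + (max_px - min_px) * min(rms, 1.0)

# a quiet then loud 440 Hz tone at 8 kHz: the visual grows with loudness
tone = [0.1 * math.sin(2 * math.pi * 440 * t / 8000) for t in range(2048)]
tone += [0.9 * math.sin(2 * math.pi * 440 * t / 8000) for t in range(2048)]
radii = [to_radius(r) for r in rms_frames(tone)]
```

Because the mapping runs per frame, a deaf child watching the circle sees loudness changes at the same rate a hearing listener would perceive them.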
Through a series of iterations with users, Grierson designed systems and
devices that were usable by the entire disabled user group. For example,
while touchscreen-based systems worked well for some, they proved
frustrating for users with severely restricted mobility or cognitive
disabilities. For children with restricted mobility, iOS device sensors
interpreted movement data directly from participants' existing
wheelchair interfaces. For children with cognitive disabilities, he
created a tactile human-computer interface called NoiseBear: a bespoke,
wireless, low-voltage, pressure-sensitive mass that uses
machine-learning techniques to adapt to individual users' needs [6]. As
far as the children are concerned it is a cuddly toy which they can move
and manipulate to control the sound and music.
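One simple way a controller can adapt to an individual user, as described above, is to learn each child's own gesture examples and classify live sensor readings against them. The sketch below is purely hypothetical (a nearest-centroid rule over multichannel pressure readings); it does not reproduce the published NoiseBear system, and all names are invented for illustration.

```python
# A hypothetical sketch of per-user adaptive mapping for a malleable
# pressure controller: record a few labelled examples of each gesture
# from one child, average them into centroids, then classify live
# readings by nearest centroid. Not the actual NoiseBear implementation.
import math

def train(examples):
    """examples: {gesture_name: [sensor_reading_vectors]} -> centroids."""
    return {name: [sum(v[i] for v in vecs) / len(vecs)
                   for i in range(len(vecs[0]))]
            for name, vecs in examples.items()}

def classify(centroids, reading):
    # pick the gesture whose centroid is closest to the live reading
    return min(centroids, key=lambda g: math.dist(centroids[g], reading))

# per-user calibration: a gentle squeeze vs. a hard press on 4 channels
model = train({
    "soft": [[0.1, 0.2, 0.1, 0.0], [0.2, 0.1, 0.2, 0.1]],
    "hard": [[0.9, 0.8, 0.9, 1.0], [0.8, 0.9, 1.0, 0.9]],
})
```

Retraining on a new child's examples re-centres the mapping on whatever movements that child can comfortably make, which is the point of the adaptation.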
References to the research
The international quality of the research is evidenced by the
publication of results in highly regarded, rigorous journals ([3], for
example, appears in the journal ranked 2nd in its field) and by the
quality of the software ([5] is an output in this REF submission).
1. Casey, M. & Grierson, M. (2007). "Soundspotter / Remix-TV: Fast
Approximate Matching for Audio and Video Performance", in Proceedings
of the International Computer Music Conference, Copenhagen, Denmark.
[Available online or from the research office]
2. Slaney, M. & Casey, M. (2008). Locality-sensitive hashing for
finding nearest neighbors, IEEE Signal Processing Magazine, 25(2), pp.
128-131. DOI: 10.1109/MSP.2007.914237
3. Casey, M., Rhodes, C. & Slaney, M. (2008). Analysis of Minimum
Distances in High-Dimensional Musical Spaces, IEEE Transactions on
Audio, Speech and Language Processing, 16(5), pp. 1015-1028.
DOI: 10.1109/TASL.2008.925883
4. Grierson, M. (2011). `Making Music with Images: Interactive
Audiovisual Performance Systems for the Deaf', International Journal on
Disability and Human Development, 10(1), pp. 37-41.
DOI: 10.1515/ijdhd.2011.009
5. First described in Grierson, M. & Kiefer, C. (2011). "Maximilian:
An easy to use, cross platform C++ Toolkit for interactive audio and
synthesis applications", in Proceedings of the International Computer
Music Conference, Huddersfield, UK. [REF output; details in REF 2b]
6. Grierson, M. & Kiefer, C. (2013). NoiseBear: A Wireless Malleable
Multiparametric Controller for use in Assistive Technology Contexts. In
CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI
EA '13), pp. 2923-2926. ACM, New York, US.
DOI: 10.1145/2468356.2479575
Funded Research Projects that relate to the underpinning research:
AHRC grant AH/D000602/1 Cognitive and Structural Approaches to
Contemporary Audiovisual Computer Aided Composition, 2006-2010; AHRC grant
AH/H038264/1 Sound, Image and the Brain: Cognitive Live-Arts Technology in
Contemporary Game-Oriented and Accessibility Paradigms, 2010-2013; OMRAS2:
A Distributed Research Environment for Music Informatics and Computational
Musicology (EP/E02274X/1, 2007-2010); Yahoo! Faculty Research Award
"Scaling Content-Based Search" (2007-2008), Google Faculty Research Award
for content-based Multimedia algorithms "Search By Groove" (2011).
Details of the impact
Adding Value to Multimedia Collections:
We have worked with several holders of multimedia collections,
including the BBC, using our algorithms to build more powerful and
flexible content-based search engines and interfaces. Some of this
work has been done with, and for, Yahoo! and Google. As Slaney, now a
Principal Research Scientist at Microsoft, writes of his time at Yahoo!:
"We set out to find technology that would leverage Yahoo's success at
building web-scale search engines, and apply the ideas to music. Michael
Casey and his team were instrumental to the work.... The optimal
algorithm work is highly novel, game changing (I believe) and helps make
LSH [Local Sensitivity Hashing] practical. LSH is now the basis of many
applications, including the television discovery and identification
algorithms fielded by Yahoo, and (I understand) the copyright detection
algorithm used by Google." (see [1])
The Yahoo! implementation, which acknowledges Casey's work, is on
GitHub [2]; it accompanies a research paper by Yahoo! researchers
Slaney, Lifshits and He that extends the Minimum Distances algorithm
[Optimal LSH, Proceedings of the IEEE, 2012].
Creating tools for sensory-impaired children:
The participatory design work continues to be developed in collaboration
with Whitefield School. The LumiSonic system is distributed by the arts
organisation Soundandmusic.org and featured in a BBC online news article
[3]. The children in the Whitefield School continue to contribute to the
design of Noisebear and other systems for making music in installations
and workshops. This work is in collaboration with Whitefield teacher
Nicole Whitelaw and continues to this day [4].
Giving a voice to disabled musicians:
Grierson carried this work forward through an international collaboration
with disabled artists, "The Dean Rodney Singers". Dean Rodney, a
21-year-old artist, rapper and musician who is also autistic, travelled
with representatives from the production company Heart n Soul to
seven countries to write and record an album with musicians, many of whom
had disabilities. Grierson was technical lead for the project. The
hardware and software drew directly on the Maximilian library and the
experience and designs developed in the research trajectory described
above. Sound was visualised through the extraction of low-level features.
Interactive music-making devices were informed by adaptations of
wheelchair controls developed in the research described above, and made
more accessible through the use of sensors.
This collaboration formed the basis of an audiovisual installation at
London's Royal Festival Hall from August 31st to September 9th 2012, as
part of the Paralympics Unlimited festival of arts, culture and sport by
deaf and disabled people. It was one of 29 commissions awarded throughout
the UK, and the only digital artwork. The methodology of working with the
singers to make the installation drew on the experience of Grierson's
participatory design research. At the installation, members of the public
created their own music tracks using the interactive technology and the
sounds created by the Dean Rodney Singers and uploaded them to the Dean
Rodney Singers YouTube channel. The installation featured six different
interactive music stations for singing, DJ'ing, dancing and other musical
activities. Over 2,000 videos were recorded in the week-long exhibition —
approximately 300 per day. The Dean Rodney Singers event received
extensive coverage in print and other media, including The Guardian [6],
Wired [7], Metro [8], TNT [9], News Shopper [10], BBC London [11], and
Lauren Laverne's `Spacepod' podcast [12].
The installation was complemented by the `Dean Rodney Singers App' for
iPhone, iPod Touch and iPad, based on Grierson's `Sonic Tag' on the Apple
app store. A further grant of £15,000 from Creativeworks resulted in Dean
Rodney releasing a second app to promote his band's single, `Fish Water'.
The app was again built with Grierson's technology and released in October
2013.
The impact of this work lies not only in the performers themselves
finding a very public and collective artistic voice, but also in the
public perception of people with disabilities, and of people with
severe disabilities in particular, which was so important a feature of
the spirit of the London Olympic and Paralympic Games.
Games and Media Companies:
Grierson — in partnership with soundandmusic.org and a games
company, Roll7 — received knowledge transfer funding from the AHRC to
create a closed-source music visualisation library. The library contains
code that extracts low-level features from music and sound effects, and
uses them to inform gameplay. This led to the development of three
prototypes and one commercial game, `OlliOlli'. The game was
commissioned by Sony Entertainment, who specifically requested that it
feature the audio analysis; it is due for release on the portable
PlayStation Vita in 2014 [5]. The collaboration with Roll7 also led to
further funding of about £40K from the Abertay Proof of Concept
programme for the development of sound-based games: Grierson's library
analyses music in real time, providing information that links the music
to the game by changing the game's pace in time with the music [5].
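The way real-time analysis can link music to gameplay might be sketched as below. This is a hedged illustration only, not the closed-source Roll7 library: a crude energy-based onset detector finds beats in frame energies, and each beat becomes a timestamped game event. All names (`detect_beats`, `schedule_events`, `spawn_obstacle`) are invented.

```python
# A hedged sketch of audio-driven gameplay (not the actual Roll7 code):
# frame energies are scanned for sudden jumps above the running mean
# (a crude onset detector), and each detected beat is converted into a
# scheduled game event such as spawning an obstacle in time with the music.
def detect_beats(energies, threshold=1.5):
    """Indices of frames whose energy exceeds `threshold` times the
    running mean of all previous frames."""
    beats, history = [], []
    for i, e in enumerate(energies):
        mean = sum(history) / len(history) if history else 0.0
        if history and e > threshold * mean:
            beats.append(i)
        history.append(e)
    return beats

def schedule_events(beats, frame_dur=0.032, event="spawn_obstacle"):
    # convert beat frame indices into timestamped game events
    return [(round(i * frame_dur, 3), event) for i in beats]

# quiet frames with two loud 'drum hits' at frames 4 and 9
energy = [0.1, 0.1, 0.1, 0.1, 1.0, 0.1, 0.1, 0.1, 0.1, 1.0]
events = schedule_events(detect_beats(energy))
```

A real system would compute the frame energies from the live audio stream, but the mapping from detected beats to game events is the part that keeps the gameplay in pace with the music.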
In 2011 the co-founder of the multi-million-pound company Searchspace
and CEO of Bodymetrics visited Goldsmiths and met Grierson for the
first time. Following a series of recent discussions they have formed
a legal partnership (though outside the impact reporting period),
together with the ex-head of Universal Music Worldwide and a founding
member of the Band Aid Trust, aimed specifically at exploiting the
expertise developed through Grierson's research on music software and
sensor technology [13].
Sources to corroborate the impact
All the material listed below is available on request from Goldsmiths'
Research Office.
- Hard copies of correspondence available on request.
- Yahoo: Efficient implementation of locality-sensitive hashing (LSH) on GitHub
- BBC News website: Helping the deaf to `see sound' (LumiSonic)
- Teacher, Whitefield School and Centre
- Director, Roll7 Blog
- The Guardian: Arts head: Mark Williams, artistic director, Heart n Soul, 21 Aug 2012
- The Guardian: London 2012 and disability arts: `we'll be famous for 15 minutes', 28 Aug 2012
- Wired, 26 July 2012
- Metro, 1 Sept 2012
- TNT Magazine, 27 Aug 2012
- News Shopper, 6 Aug 2012
- BBC London, 21 Aug 2012
- Lauren Laverne's `Spacepod' podcast, 4 Sept 2012
- Contact Director, Bodymetrics [contact details provided separately]