New tools to study complex data sets
Submitting Institution
King's College London
Unit of Assessment
Mathematical Sciences
Summary Impact Type
Technological
Research Subject Area(s)
Mathematical Sciences: Statistics
Information and Computing Sciences: Computation Theory and Mathematics
Economics: Econometrics
Summary of the impact
Research by Tiziana Di Matteo on network-based filtering techniques has
led to powerful new tools for the characterization of dependencies in
large complex data sets. This has generated impact on practitioners and
professional services in the biotechnology industry and on financial
regulators. The Swiss biotechnology firm THERAMetrics Holding AG has used
Di Matteo's techniques to develop a quantitative methodology for
validating its knowledge-based research platform for drug repositioning.
Within a consultancy project awarded to her by the Financial
Services Authority (FSA), the information filtering techniques were used
to provide advice on the methodological correctness of Econophysics
techniques applied to a market cleanliness event study.
Underpinning research
Tiziana Di Matteo's (TDM) main research area is Complex Systems and
Econophysics, including the application of methods from statistical
physics and network theory to economic modelling, and the analysis of
financial markets and social problems.
She was, in particular, the first scientist to propose analysing complex
financial datasets (correlation and autocorrelation matrices of interest
rates and stock market indices) from the perspective of geometrical and
topological properties of metric graphs, embedded in spaces of appropriate
dimensions and curvature.
This proposal addresses a problem one generally faces when observing the
behaviour of large complex systems, namely that relevant features in such
systems are typically both local and global, and that these
different levels of organization emerge at different scales in a way that
is intrinsically not reducible. It is therefore essential to detect
clusters together with the different hierarchical patterns of dependencies
both above and below the cluster levels. Graph embedding techniques have
provided efficient tools to solve this task for a wide variety of complex
systems, including financial and biological systems.
Specifically, in the last few years, TDM and collaborators have been
focusing on the dynamical characterization of correlated financial
data in terms of graphs [1,2], studies that, apart from their scientific
interest, are very relevant to risk estimation and portfolio selection.
One of the main research interests addressed in these studies is the
extraction of meaningful information about the behaviour of, and
interactions between, the variables describing a system under study, often
from data sets containing a high level of redundancy, and the use of such
information to model and forecast their collective evolution. The
main methodological outcome of these studies has been the discovery of a
new method to filter information out of complex datasets [5]. Recent
developments have shown that this approach can be used to extract clusters
and hierarchies from high-dimensional complex data sets in an unsupervised
and deterministic manner, without the use of any prior information
[3,4]. This network-based approach to information filtering has opened new
ways to study financial systems and also several other fields where a
large number of interrelated variables are concerned, such as inference in
biomedicine [6].
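As a minimal illustration of what filtering a correlation matrix through a graph can look like, the sketch below keeps only the edges of a minimum spanning tree. This is a deliberately simpler stand-in and not the topological-embedding method of [3,4,5]; the distance mapping sqrt(2(1 - rho)) is a commonly used choice assumed here, and the four-asset correlation matrix and its labels are hypothetical.

```python
import math

def correlation_to_distance(rho):
    """Map a correlation in [-1, 1] to the distance sqrt(2 * (1 - rho))."""
    return math.sqrt(2.0 * (1.0 - rho))

def minimum_spanning_tree(corr, labels):
    """Kruskal's algorithm on the complete graph weighted by correlation distance.

    Returns the N - 1 retained edges as (label_a, label_b, distance) tuples.
    """
    n = len(labels)
    edges = sorted(
        (correlation_to_distance(corr[i][j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))  # union-find forest used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two separate components
            parent[root_i] = root_j
            tree.append((labels[i], labels[j], dist))
            if len(tree) == n - 1:
                break
    return tree

# Hypothetical correlation matrix: two tightly correlated pairs of assets.
labels = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
mst = minimum_spanning_tree(corr, labels)
for a, b, d in mst:
    print(f"{a} - {b}: distance {d:.3f}")
```

Of the six pairwise correlations only three edges survive, and the two strongly correlated pairs end up directly linked. A spanning tree discards more information than a planar or surface-embedded graph would, so it should be read only as the simplest member of this family of filters.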
Key Researchers
1. Dr Tiziana Di Matteo
- King's College London, since 01/09, Reader in Financial Mathematics
2. Ruggero Gramatica
- King's College London, since 01/10, PhD Student (PT)
- 04/10 - 09/13. CEO of MondoBiotech AG (now THERAMetrics Holding AG)
References to the research
1. T. Aste, W. Shaw, T. Di Matteo, "Correlation structure and dynamics in
volatile markets", New J. Phys. 12 (2010) 085009.
DOI:10.1088/1367-2630/12/8/085009.
2. T. Di Matteo, F. Pozzi, T. Aste, "The use of dynamical networks to
detect the hierarchical organization of financial market sectors", The
European Physical Journal B 73 (2010) 3-11. DOI:
10.1140/epjb/e2009-00286-0
3. Won-Min Song, T. Di Matteo, T. Aste, "Nested hierarchies in planar
graphs", Discrete Applied Mathematics 159 (2011) 2135-2146. DOI:
10.1016/j.dam.2011.07.018.
4. * Won-Min Song, T. Di Matteo, T. Aste, "Hierarchical information
clustering by means of topologically embedded graphs", PLoS One 7(3)
(2012) e31929. DOI: 10.1371/journal.pone.0031929.
5. * T. Aste, Ruggero Gramatica, T. Di Matteo, "Exploring complex
networks via topological embedding on surfaces", Physical Review E
86 (2012) 036109. DOI: 10.1103/PhysRevE.86.036109.
6. * R. Gramatica, D. Bevec, T. Di Matteo, M. Barbiani, S. Giorgetti and
T. Aste, "Graph theory enables drug repurposing - How a mathematical model
can drive the discovery of hidden Mechanisms of Action", submitted to PLoS
One (2013); available at
http://arxiv.org/abs/1306.0924.
Articles marked with an asterisk best indicate the quality of the
underpinning research.
Details of the impact
The research described in section 2 has generated two instances of
impact. It has been used by a Swiss biotechnology company to validate
principles underlying the construction of a knowledge based approach that
allowed the discovery of patterns connecting a certain set of peptides
with the occurrence of a set of rare diseases, and it has led to
consultancy work done for the Financial Services Authority (FSA) in a
drive by the FSA to improve their tool-kit used to carry out market
cleanliness event studies.
TDM's consultancy work for the FSA can be understood as a consequence of
the fact that she is acknowledged as one of the leading experts in
Econophysics, the theory of complex systems, the analysis of financial
markets using techniques of statistical physics, network theory and
numerical methods, and in particular in the specific information filtering
techniques she has developed since joining KCL.
The recent financial crisis has led financial institutions to rethink the
methodologies they had in place at that time. In this context, TDM was
approached by the Financial Services Authority (FSA) in 2010 to provide
advice for a project to strengthen the toolbox used in so-called market
cleanliness event studies, specifically to test whether new techniques
from Econophysics could help to improve the accuracy and diagnostic power
of such studies. The cleanliness of markets is important for London as a
financial centre. The FSA therefore undertakes and publishes market
cleanliness studies annually. These provide a measure indicative of the
level of suspicious trading activity (insider trading) in the London stock
market by detecting anomalous trading and price-movement patterns that
occur ahead of the release of important information, such as announcements
of takeovers or regulatory changes.
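The idea of detecting anomalous price movements ahead of an announcement can be sketched as a basic event-study calculation. The code below is an illustrative assumption and not the FSA's actual methodology: it uses a market-adjusted abnormal return (stock return minus market return), a hypothetical two-day pre-announcement window, a threshold of two standard deviations, and made-up daily returns.

```python
import math

def abnormal_returns(stock, market):
    """Market-adjusted abnormal return: stock return minus market return."""
    return [s - m for s, m in zip(stock, market)]

def pre_announcement_flag(stock, market, window=2, threshold=2.0):
    """Flag an announcement whose pre-announcement price move looks anomalous.

    The last `window` observations are the days just before the announcement;
    all earlier observations form the estimation period for normal behaviour.
    """
    ar = abnormal_returns(stock, market)
    estimation, event = ar[:-window], ar[-window:]
    mean = sum(estimation) / len(estimation)
    var = sum((x - mean) ** 2 for x in estimation) / (len(estimation) - 1)
    car = sum(event)  # cumulative abnormal return over the event window
    # Standard deviation of a `window`-day sum, assuming independent days.
    scale = math.sqrt(var * window)
    z = (car - window * mean) / scale
    return car, z, abs(z) > threshold

# Hypothetical daily returns; the last two days precede the announcement.
stock = [0.001, -0.002, 0.003, 0.000, -0.001, 0.002, 0.030, 0.040]
market = [0.001, -0.001, 0.002, 0.001, -0.002, 0.001, 0.002, 0.001]

car, z, flagged = pre_announcement_flag(stock, market)
print(f"CAR = {car:.3f}, z = {z:.1f}, flagged = {flagged}")
```

One simple aggregate, again an assumption here, would be the share of announcements in a year that are flagged in this way. A real study would use a longer estimation period and a more careful model of normal returns.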
TDM's work on network-based filtering has been used by the FSA to
cross-validate its analysis of the cleanliness of a set of financial
events. TDM's role within the consultancy project at the FSA
was to provide advice on the methodological correctness and suitability of
Econophysics techniques applied to such market cleanliness event studies;
this included advice on coding and interpretation of results. Feedback
received from the FSA suggests that TDM's contribution to the project was
regarded as very valuable in enhancing the FSA's methodological awareness,
and that TDM's network-based filtering techniques in particular could be
used to refine the FSA's standard market cleanliness indicator.
A second instance of impact has been generated in the biotechnology
industry. In 2009, Ruggero Gramatica (RG) contacted KCL's Financial
Mathematics group with a view to studying for a PhD in Econophysics under
the supervision of TDM. After joining KCL in 2010, RG was appointed CEO of
the Swiss biotechnology company mondoBIOTECH AG, now THERAMetrics Holding
AG. He quickly realized that the network-related tools and techniques
pioneered by TDM and co-workers could be generalized and fruitfully
applied to the data-analysis problems of concern to his company, which
deals with the discovery of drugs for rare diseases via a knowledge-based
process of repurposing existing drugs.
Specifically, THERAMetrics was looking for an inferential methodology
that could validate their line of research. This research deals with
automatically extracting bio-medical information on human physiology from
the published works of biochemists and physicians, with a view to
discovering new Mechanisms of Action (MoA). While biochemists will refer
to proteins, receptors, genes and biochemical processes, physicians and
health practitioners will mention symptoms, clinical tests, diseases, body
organs, tissues, and drugs available for treatment. The central task is to
combine such unstructured and dispersed information in a manner that
allows relating, for instance, information about biochemical processes
with diseases, symptoms and treatments discussed by clinicians.
More than 10 man-years of research were invested at THERAMetrics,
starting from the original model proposed by RG. This model provides a
knowledge-based graph derived from more than 3 million scientific
publications, made up of hundreds of thousands of nodes carrying a very
dense set of correlated information, with the aim of searching for the
non-obvious paths that connect certain molecules to certain diseases by
stepping through a number of biological pathways (i.e. new MoAs). TDM's
research and expertise, described in section 2, were instrumental in
extracting such emerging patterns and creating new bio-mathematical tools.
Using these tools, RG and the scientific team of THERAMetrics have been
able to validate a number of the molecule-disease relations that had been
present in their candidate pipeline, and in so doing were able to
reinforce the scientific foundations of THERAMetrics' drug-discovery
platform. Indeed, RG has meanwhile filed for IP protection for the general
semantic and mathematical model underlying this research.
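The search for paths connecting a molecule to a disease through intermediate biological entities can be pictured with a toy example. The sketch below uses a plain breadth-first search on a small, hypothetical graph; the source does not state how paths are searched in the THERAMetrics platform, and all node names are placeholders rather than real biomedical entities.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical knowledge graph stored as adjacency lists.
graph = {
    "peptide_X": ["receptor_1", "receptor_2"],
    "receptor_1": ["peptide_X", "pathway_A"],
    "receptor_2": ["peptide_X"],
    "pathway_A": ["receptor_1", "symptom_S"],
    "symptom_S": ["pathway_A", "disease_D"],
    "disease_D": ["symptom_S"],
}

path = shortest_path(graph, "peptide_X", "disease_D")
print(" -> ".join(path))
```

On a graph with hundreds of thousands of densely connected nodes, the shortest path is rarely the most informative one, which is where filtering the graph before searching becomes relevant.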
THERAMetrics Holding AG has been loss-making in the past. However,
alongside other factors related to the company's restructuring plan, the
successful results obtained with the above-mentioned methodology helped
the company to define a sound research platform. This platform became a
valuable asset in the recent business combination, realized by means of a
reverse merger with Pierrel's Contract Research International, in which
THERAMetrics Holding AG contributed an innovative element to the drug
rescuing and repurposing strategy. The takeover/merger was concluded in
September 2013 and, thanks to this business combination, THERAMetrics
Holding AG has significantly increased its market capitalization.
Sources to corroborate the impact
Financial/economic background concerning THERAMetrics including
information about details of the corporate takeover/merger with Pierrel
SpA in 2013 at
http://www.therametrics.com/investor/investors/key-information
KCL mirror of THERAMetrics site
Personal sources to corroborate impact at THERAMetrics Holding AG
- Former CEO of THERAMetrics Holding AG (testimonial received and
available on request).
- Chief Scientific Officer at THERAMetrics Holding AG
Personal source to corroborate impact at the Financial Services Authority
(FSA), now Financial Conduct Authority (FCA)
- Manager, Economics of Financial Regulation, FSA (testimonial received
and available on request)