Data-driven Decision Support
Submitting Institution
Robert Gordon University
Unit of Assessment
Computer Science and Informatics
Summary Impact Type
Technological
Research Subject Area(s)
Mathematical Sciences: Numerical and Computational Mathematics, Statistics
Information and Computing Sciences: Artificial Intelligence and Image Processing
Summary of the impact
Many organisations rely on increasingly large and complex datasets to
inform operational decision-
making. To assist decision-makers when decisions are data-driven,
computational tools are
needed that present reliable summary information and suggest options
allied to the key objectives
of decision-making. Research at RGU has developed novel learning and
optimisation algorithms
driven by multifactorial data and implemented them in commercial
decision-support software. The
research has had economic impact by providing products to be sold:
a drilling rig selection tool
(ODS-Petrodata Ltd.) and a subsea hydraulics diagnostic tool (Viper Subsea
Ltd.). Further economic
impact comes from operations management software developed for British
Telecom.
Underpinning research
Our research expertise in machine learning for complex decision support
dates from 1999 when
we first applied evolutionary algorithms to optimise chemotherapy
treatments using data-validated
tumour simulations [R1]. As decision-making has become increasingly
data-driven, we have
researched approaches that automatically detect interactions between
decision variables in
multifactorial datasets that are pertinent to key outcome variables. Such
relationships when
discovered can be presented to decision-makers directly and/or combined
with methods that
automatically suggest one or more decisions for consideration. In much of
our work this sort of
relationship may only be implicitly derived from running a simulation
based on the decision
variables or by analysing large historical datasets containing both
decision and outcome variables.
An example is the relationship between the selection, intensity and
phasing of anti-cancer drugs
and the outcomes in terms of tumour reduction, patient quality of life and
the severity of toxic side
effects. [R2] was the first work to apply multi-objective
naturally-inspired algorithms combined with
tumour response simulations and clinical trials data to support clinicians
in developing improved
cancer chemotherapies. All of these approaches are computationally
intensive and complexity
grows exponentially with dataset size, number of decision variables and
number of choices per
decision. This is problematic because real-world decision-support systems
must be responsive to
decision-makers and be implemented in a typical industrial IT environment.
Our research has therefore sought computationally cheap methods of
data-driven decision-
support, developing state-of-the-art learning and multi-objective
optimisation algorithms that work
with simulations and datasets and evaluating them against comparable
techniques on theoretical
and on real world problems. There are two main approaches in our work.
Search efficiency: we build cheap but inaccurate data models
during optimisation to quickly focus
the search on feasible, near-optimal regions of the search space. More
expensive and accurate
computation can then be focussed on a small number of highly promising
solutions. In 2007 we
introduced the Chain-Model Genetic Algorithm (ChainGA) for Bayesian
Network Structure Learning
[R3]. Significant speed-up was achieved using simple chain models to
quickly evaluate candidate
solutions. Comparisons with a classical algorithm (K2GA) showed that the
benefits of ChainGA
were highly problem-dependent. Further research has reinforced this
finding and identified how
search space and algorithm features interact to affect search efficiency
[R4].
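The screening idea described above can be illustrated with a minimal sketch. It is not the ChainGA of [R3]: the two evaluation functions, the candidate generator and all parameter values are hypothetical stand-ins, chosen only to show a cheap model ranking many candidates so that the expensive evaluation runs on a short list.

```python
import random

def expensive_eval(x):
    """Hypothetical stand-in for a costly, accurate evaluation (lower is better)."""
    return sum((xi - 0.3) ** 2 for xi in x)

def cheap_eval(x):
    """Hypothetical cheap, inaccurate model: it only looks at the first two variables."""
    return sum((xi - 0.3) ** 2 for xi in x[:2])

def screened_search(n_candidates=200, n_vars=5, top_k=10, seed=0):
    """Rank all candidates with the cheap model, then evaluate only the top_k accurately."""
    rng = random.Random(seed)
    candidates = [[rng.random() for _ in range(n_vars)] for _ in range(n_candidates)]
    # Cheap pass: every candidate is scored, but each score costs very little.
    shortlist = sorted(candidates, key=cheap_eval)[:top_k]
    # Expensive pass: only the shortlisted candidates are evaluated accurately.
    scored = [(expensive_eval(x), x) for x in shortlist]
    best_cost, best_x = min(scored)
    return best_x, best_cost, len(scored)

best_x, best_cost, n_expensive = screened_search()
print(f"expensive evaluations used: {n_expensive} of 200 candidates")
```

How well such screening works depends on how closely the cheap model tracks the accurate one, which is consistent with the problem-dependence noted above.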
Search accuracy: we use sophisticated learning to greatly reduce
computational effort through
needing fewer solution evaluations. This is particularly appropriate where
the costs of model
building are dwarfed in comparison to the costs of evaluating solutions.
The DEUM algorithm
builds a Markov network model of the relationship between solutions and
fitness during an
optimisation. In 2007, we compared DEUM against other leading EDAs [R5]
showing that it
significantly reduced the number of function evaluations needed for
optimisation at the expense of
computational cost. More recent work (2008 onwards) has focussed on
analysing the number of
evaluations needed in relation to known or discovered problem structure.
Precise estimates of the
number of evaluations required have now been shown to hold for a
wide range of theoretical
and applied problems [R6]. Therefore robust estimates can be made of the
computational cost
trade-offs between solution evaluation and model-building for new
applications based on the
nature and number of the variables involved and simulation run-times.
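The trade-off between solution evaluation and model-building can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not the estimates of [R6]: the two algorithm profiles, their evaluation counts and their model-building times are invented numbers, and total cost is assumed to be simply evaluations plus model builds.

```python
def total_cost_ms(n_evaluations, eval_ms, n_model_builds, build_ms):
    """Total run time in ms: solution evaluations plus model-building steps."""
    return n_evaluations * eval_ms + n_model_builds * build_ms

# Hypothetical algorithm profiles (illustrative numbers only).
PROFILES = {
    # Simple model: cheap to build, but needs many solution evaluations.
    "light_model": {"n_evaluations": 10_000, "n_model_builds": 100, "build_ms": 100},
    # Sophisticated model: costly to build, but needs far fewer evaluations.
    "heavy_model": {"n_evaluations": 1_000, "n_model_builds": 100, "build_ms": 60_000},
}

def cheapest_profile(eval_ms):
    """Return the profile with the lowest total cost for a given evaluation time."""
    costs = {name: total_cost_ms(eval_ms=eval_ms, **p) for name, p in PROFILES.items()}
    return min(costs, key=costs.get), costs

for eval_ms in (10, 30_000):  # a fast fitness function vs a 30-second simulation
    name, costs = cheapest_profile(eval_ms)
    print(eval_ms, name, costs)
```

With these assumed numbers the light model wins when an evaluation takes 10 ms, and the heavy model wins when each evaluation is a 30-second simulation, which is the situation where the costs of model building are dwarfed by the costs of evaluating solutions.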
Key researchers:
John McCall Lecturer > Senior Lecturer > Reader > Professor
(1992 - date)
Andrei Petrovski Lecturer > Senior Lecturer > Reader (1998 - date)
Siddhartha Shakya PhD Student > Research Fellow (2003-7) > Research
Manager, British
Telecom Innovate and Design (2007 - date)
References to the research
* indicates key reference.
[R1] J. McCall, A. Petrovski (1999), A Decision Support System for
Chemotherapy using Genetic
Algorithms. In M. Mohammadian (Ed.) Computational Intelligence
for Modelling, Control and
Automation, pp 65-70, IOS Press, 1999. [32 Google Scholar citations
(10 self)]
* [R2] Andrei Petrovski and John McCall (2001), Multi-objective
Optimisation of Cancer
Chemotherapy Using Evolutionary Algorithms, in E.Zitzler et al.
(Eds.): Evolutionary Multi-Criterion
Optimization, Lecture Notes in Computer Science 1993,
pp531-545, Springer-Verlag ISBN: 3-540-
41745-1, ISSN:0302-9743. [33 Google Scholar citations (11 self)]
[R3] Ratiba Kabli, Frank Herrmann, John McCall, A Chain-Model Genetic
Algorithm for Bayesian
Network Structure Learning, in GECCO 2007 Volume II pp1264-1271. [22
Google Scholar citations
(12 self)]
[R4] Wu, Y., McCall, J., & Corne, D. (2010). Two novel Ant Colony
Optimization approaches for
Bayesian network structure learning. In Evolutionary Computation (CEC),
2010 IEEE Congress on
(pp. 1-7). IEEE. [15 Google Scholar citations (7 self)]
* [R5] Siddhartha Shakya, John McCall (2007), Optimization by
Estimation of Distribution with
DEUM Framework Based on Markov Random Fields Int. Jnl of Automation
and Computing 04 (3),
262-272 DOI: 10.1007/s11633-007-0262-6 [53 Google Scholar citations (16
self)]
* [R6] Alexander E.I. Brownlee, John A.W. McCall, Siddhartha K. Shakya and
Qingfu Zhang (2009),
Structure Learning and Optimisation in a Markov-network based Estimation
of Distribution
Algorithm, in Proceedings of the Eleventh IEEE Congress on Evolutionary
Computation, pp447-
454, 2009. [27 Google Scholar citations (10 self)]
Details of the impact
Pathway to Impact
Decision support technologies have a complex pathway to impact. We
operate in areas where
complex decision-making controls the use of large assets or the deployment
of workforces. Here
operational expenditure is large and so are the consequences of
inefficient decision-making.
Therefore the process of adopting innovation in decision-making policy is
multi-staged and
conservative. Some large companies innovate in-house — an example is
British Telecom. Others,
such as oil and gas majors, outsource innovation to technology support
companies, which are
often SMEs. Examples of such support companies here are Viper Subsea and
ODS Petrodata.
Decision-support technologies are therefore developed through stages of
"technology readiness"
before ultimate internal adoption or external purchase by a large company.
There is therefore a
value chain of technology adoption. As an academic research group, we
operate at the start of the
value chain by translating advanced knowledge in optimisation and learning
into software
components that address a decision-making problem identified by a company
as an area for
improvement. We essentially operate as an outsourced part of these
companies' R&D functions
and add value by proving concepts, quantifying potential efficiencies and
encapsulating specialist
know-how in software components around which decision-support products can
be developed and
sold. Our main economic impact therefore is in creating this value. It is
measured by proven
efficiencies in decision-making processes and the potential sales value
this creates for the
companies, or, in the case of a large company, internal adoption for use
in high-value decision-
making. Typically we operate by embedding our research students with
partner companies and
often this leads to employment to support development through the next
stages of the value chain.
Therefore, a secondary economic impact is the transfer of high-skilled
individuals into companies.
Reach
In this case study we explain how our research in data-driven decision
support has impacted one
large company, British Telecom, and two SMEs, ODS Petrodata and Viper
Subsea. In each case,
the interaction has resulted in value-add in terms of decision-support
software and the long-term
employment of research students by the company. We discuss specific
aspects of the pathway
and significance for each company in turn.
Operations Management Decision Support at British Telecom
Context
Workforce Dynamics Simulator (WDS) is a dynamic business simulation
environment, developed
at British Telecom, which enables large scale simulation of workforces and
notable resources. It
uses historical or generated data to investigate the execution of work
plans. WDS takes detailed
input data on geographical locations, tasks to be completed and engineers
available, typically over
a 90-day planning period and involving around 25,000 workers across the
UK. The system
simulates a variety of scenarios and the output reports are used to
support decision making in the
business. A key problem is that WDS is sensitive to choice of input
parameters. These must be
tuned to ensure that the predictions in the output report are as accurate
as possible.
Pathway
In 2010-11, BT funded a project at RGU, supervised by McCall, to
apply metaheuristic optimization
algorithms to carry out the tuning of WDS. WDS simulations are expensive
and accuracy is
important so search accuracy was a target here. A variety of
different algorithms were evaluated,
including the DEUM algorithm. The decision to investigate the use of DEUM
in this case was
motivated by the high computational cost incurred with each simulation. We
packaged a suite of
algorithms, including DEUM, as a software component called the WDS Tuning
Tool (WDSTT).
Impact and Significance
WDSTT has been adopted by BT, replacing a parameter sweep of tuning
parameters for WDS.
WDSTT adds value through field workforce simulations that are up to 5% more
accurate and tuning times that are shorter by up to a factor of 10. Since the
incorporation of WDSTT, the
simulation software has
been used repeatedly in BT to support management decision-making in
assessing the effect of
changes in different business scenarios and transformation initiatives
including: the effect of new
product introduction; the effect of special events on demand; and support
for IT investment
decisions [I1]. It is impossible to quantify financially the effect of
more accurate simulation on the
quality of decisions. However, the decisions involved correspond to
investment of multi-million
pound amounts in the deployment and organisation of over 20,000 field
workforce employees, so
small percentage changes in accuracy translate into significant financial
and operational value.
McCall's former research student (Shakya) is permanently
employed as a research manager at
British Telecom. While at BT, he has applied DEUM to a dynamic pricing
problem [I2].
Subsea Hydraulic Control Diagnostics at Viper Subsea
Viper Subsea (VS) are an engineering company offering specialised
consultancy support for
hydraulic control systems for offshore oil and gas production. Offshore
drilling installations consist
of an interconnected set of pipes, pumps and valves used to collect
hydrocarbons from different
wells for onward processing. Pumps and valves are controlled
hydraulically. Due to the nature of hydrocarbon flow, pipes, pumps and valves
can become clogged or
stuck and so performance changes over time. Shutdowns in production cost
millions of dollars per
day and so it is important to correctly diagnose faults in the system when
they occur, either to
avoid shutdowns or to minimise and accurately target them. Viper Subsea are
therefore interested in
developing diagnostic software for sale to operators of offshore
installations.
Pathway
VS and Technology Strategy Board (TSB) funded a two-year KTP project
(2011 - 13), supervised
by Petrovski and McCall, to develop algorithms for pro-active fault
diagnosis using control
feedback data from the real-time operation of subsea hydraulic systems. As
the algorithms need to
be accurate but also run on limited-capacity machines in robust
environments, search efficiency
and search accuracy are both key. A wide range of learning algorithms,
including Bayesian
Network learning [R3, R4], was evaluated on test problems, using test data
supplied by a major
operator. A diagnostic software component was packaged in a prototype
product V-Sentinel, which
can, within a few seconds, identify abnormalities in data streams from 14
sensors, characterising
the level of hydraulic fluid in a tank, quality of insulation, and
validating pressure sensors.
Impact and Significance
In a final report on the project, required by TSB, VS valued V-Sentinel
as having a profit-generating
capacity of £165K per annum [I3]. It is reasonable to value an asset at
five times its annual profit-
generating capacity, in this case £825K. VS are actively developing
V-Sentinel and the research
student is now permanently employed by the company for that purpose.
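The valuation arithmetic used here is a simple multiple of annual profit-generating capacity. The helper below is only a worked check of that rule; the function name is hypothetical and the default multiple of five is the one stated above.

```python
def asset_value_gbp(annual_profit_gbp, multiple=5):
    """Value an asset as a fixed multiple of its annual profit-generating capacity."""
    return annual_profit_gbp * multiple

# V-Sentinel: GBP 165K per annum at a multiple of five.
print(asset_value_gbp(165_000))  # 825000
```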
Drilling Rig Market Data Modelling at ODS Petrodata
Context
ODS-Petrodata Ltd. (ODSP) are an Aberdeen-based SME specialising in the
provision of oil and
gas market data. ODSP's market data and analyses are made available to the
oil and gas market
through a number of web products including RigPoint, which oil and gas
majors and large financial
investors use to inform their selection of offshore drilling rigs for
particular jobs. The drilling
performance of a given rig depends on a large number of factors relating
to rig specification, drilling
location and water depth. Given that each rig-hire costs several hundreds
of thousands of dollars
per day, the potential benefits of reduced drilling time by a few days on
average, realised across
the hundreds of annual hires in the global market, run to tens of millions of dollars per annum.
ODSP were therefore
interested in adding value to their RigPoint product with a drilling rig
selection tool, able to inform
choice by predicting time to drilling depth for selected rigs.
Pathway
ODSP and TSB funded a three-year KTP project (2009 - 12), supervised by McCall
and
Petrovski, focussed on integrating and modelling two large datasets
containing over one hundred
and fifty data fields and fifty thousand records of drilling rig and well
data. The size of the datasets
and the consequent modelling complexity necessitated search efficiency.
Chain-model algorithms
[R3, R4] were used to produce Bayesian Network models relating the data
factors. These models
were incorporated in a software tool to predict the drilling performance
of particular rigs given a
range of over forty possible input factors that would be known to decision
makers, including drilling
location and depth. The software is capable of inferring factors that are
not given.
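How a Bayesian Network can produce a prediction when some input factors are missing can be shown with a toy example. This is a sketch under stated assumptions, not the ODSP model: the three-node network (rig class and depth band as independent parents of drilling speed) and every probability in it are invented for illustration.

```python
# Hypothetical prior over rig class (invented numbers).
P_RIG = {"jackup": 0.6, "semisub": 0.4}

# Hypothetical P(fast drilling | rig class, depth band) (invented numbers).
P_FAST = {
    ("jackup", "shallow"): 0.8,
    ("jackup", "deep"): 0.3,
    ("semisub", "shallow"): 0.6,
    ("semisub", "deep"): 0.5,
}

def p_fast(depth, rig=None):
    """P(fast | depth, rig); if rig is not given, sum it out using its prior."""
    if rig is not None:
        return P_FAST[(rig, depth)]
    # Rig class and depth band are assumed independent, so P(rig | depth) = P(rig).
    return sum(P_RIG[r] * P_FAST[(r, depth)] for r in P_RIG)

def p_rig_given_fast(depth):
    """Infer the missing rig class from an observed fast outcome (Bayes' rule)."""
    joint = {r: P_RIG[r] * P_FAST[(r, depth)] for r in P_RIG}
    total = sum(joint.values())
    return {r: v / total for r, v in joint.items()}

print(round(p_fast("deep"), 2))                     # 0.38
print(round(p_fast("deep", rig="semisub"), 2))      # 0.5
print(round(p_rig_given_fast("deep")["jackup"], 3)) # 0.474
```

The same network answers both kinds of question: a prediction with a factor left unspecified, and an inference about the unspecified factor itself.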
Impact and Significance
In the final report on the project, required by TSB, ODSP valued the
software component as having
a profit-generating capacity of £1.35M per annum [I4], so an estimated
value of £6.75M. As the
project was nearing completion, ODSP was acquired by a large US data
company IHS Ltd. IHS
committed to continuing development of the rig efficiency software [I4]
and the research student
was taken on as a permanent employee.
Sources to corroborate the impact
[I1] Letter of corroboration from Head of Resource Management
Technologies Research,
British Telecom Innovate and Design.
[I2] Shakya, Siddhartha, Fernando Oliveira, and Gilbert Owusu. "Analysing
the effect of
demand uncertainty in dynamic pricing with EAs." Research and
Development in Intelligent
Systems XXV. Springer London, 2009. 77-90.
[I3] Partners Final Report Form: Knowledge Transfer Partnership
KTP008580.
[I4] Partners Final Report Form: Knowledge Transfer Partnership
KTP006922.