### Economical Experiments for the Fuel Efficiency Industry

**Submitting Institution**

Queen Mary, University of London

**Unit of Assessment**

Mathematical Sciences

**Summary Impact Type**

Economic

**Research Subject Area(s)**

Mathematical Sciences: Statistics

Economics: Applied Economics, Econometrics

**Summary of the impact**

The petrochemical industry is eager to develop advanced fuels which improve fuel efficiency both for economic and environmental reasons. Statistics plays a crucial role in this costly process. Innovative Bayesian methodology developed by Gilmour was applied at Shell Global Solutions to data from fuel experiments to solve a recurring statistical problem. The usefulness of this approach to the wider petrochemical industry has been recognized by the industry-based Coordinating European Council (CEC) for the Development of Performance Tests for Fuels, Lubricants and other Fluids, who in their statistics manual have included Gilmour's method as an alternative to procedures in the ISO 5725 standard.

**Underpinning research**

The underpinning research and methodology were developed by Gilmour, Reader (then Professor) in Statistics at Queen Mary from 2000 until 2010. During this period he made important contributions to the design and analysis of industrial experiments and developed the advanced Bayesian statistical methods used at Shell and by participating CEC laboratories.

Since the mid-1990s there has been an increasing realisation that many industrial factorial experiments involve some factors whose levels are more difficult to reset than others and this leads naturally to the use of multi-stratum designs, such as nonorthogonal split-plot designs. These imply the use of linear mixed models, with random error terms corresponding to each stratum in the design, and the experiments must be designed with such models in mind. The first paper [3] to deal directly with designing such experiments, rather than adapting existing designs to the required structures, was by Trinca and Gilmour.

As well as developing the methodology, Gilmour was closely involved in applying it to specific experiments, e.g. a food processing experiment which investigated the effects of five process factors on the drying rate and the retention of several volatile compounds for freeze-dried coffee [4]. One of the factors, the pressure, could only be reset once per day, while the other factors could be reset for five runs each day. While the research on multi-stratum designs provided a useful and informative design for this experiment, the data analysis had one surprising feature. The state-of-the-art method of analysis for data from multi-stratum designs involves estimating the random effects by residual maximum likelihood (REML) and the fixed effects by empirical generalized least squares (GLS). In the freeze-dried coffee experiment, the variance component for days was estimated to be zero. One consequence of this is that the standard errors of the effects of pressure are estimated to be the same as they would have been if the pressure had been reset for every run, which might be too optimistic.

The estimation of variance components for high strata to be zero was found in several other data sets from multi-stratum designs and in 2006, Goos published results showing that there was a high probability of obtaining zero estimates, even if the true value was substantially greater than zero. In many cases such estimates are unrealistic and arise due to a flat likelihood in designs which are rather uninformative about these variance components. Gilmour and Goos obtained Royal Society international joint project grant 2007/R2 to study the problem further.
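The frequency of zero estimates can be illustrated with a small simulation. The sketch below is a simplification, not the nonorthogonal split-plot setting of the research: it assumes a balanced one-way layout (a few main plots, such as days, each with the same number of runs), where the non-negative REML estimate of the main plot variance component is zero exactly when the between-plot mean square does not exceed the within-plot mean square. The function name, parameter values and seed are hypothetical choices for illustration.

```python
import random
from statistics import mean


def fraction_zero_estimates(var_plot, var_run, plots, runs, reps, seed=1):
    """Share of simulated data sets whose main plot variance estimate is zero.

    Assumes a balanced one-way random effects model, in which the
    non-negative REML estimate is zero whenever MSB <= MSW.
    """
    rng = random.Random(seed)
    zeros = 0
    for _ in range(reps):
        data = []
        for _ in range(plots):
            effect = rng.gauss(0.0, var_plot ** 0.5)  # main plot (e.g. day) effect
            data.append([effect + rng.gauss(0.0, var_run ** 0.5) for _ in range(runs)])
        plot_means = [mean(plot) for plot in data]
        grand_mean = mean(plot_means)
        # Within-plot and between-plot mean squares from the one-way ANOVA.
        msw = sum((y - m) ** 2 for plot, m in zip(data, plot_means) for y in plot)
        msw /= plots * (runs - 1)
        msb = runs * sum((m - grand_mean) ** 2 for m in plot_means) / (plots - 1)
        if msb <= msw:
            zeros += 1
    return zeros / reps


# True main plot variance equal to the run-to-run variance, only 4 main plots.
share = fraction_zero_estimates(var_plot=1.0, var_run=1.0, plots=4, runs=2, reps=2000)
print(f"Share of zero estimates: {share:.2f}")
```

Even though the true main plot variance here is as large as the run-to-run variance, a noticeable share of the simulated data sets gives a zero estimate, because four main plots carry very little information about that component.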

The specific research that underpins the impact is published in the paper [1] by Gilmour and Goos, combining ideas from the analysis of saturated designs [2] with the mixed models used for analysing data from multi-stratum designs. Paper [1] details a general method of statistical analysis for multi-stratum designs that incorporates prior information into the experimental analysis when estimating variance components. Gilmour and Goos compared their Bayesian approach to the commonly-used REML-GLS for non-orthogonal split-plot designs. They discovered that REML-GLS estimation in non-orthogonal split-plot designs with few main plots gives misleading conclusions, whereas their Bayesian approach led to a more informative analysis.

The proposed method uses Bayesian analysis [1, 2] incorporating a prior distribution for the main plot variance component, which in the round robin experiments of fuels considered in the impact case corresponds to the laboratory-to-laboratory variance component. The method is implemented by using Markov chain Monte Carlo sampling [1] using the WinBUGS software and SAS code. With this Bayesian approach informative priors will outweigh uninformative data, and informative data will outweigh uninformative priors. As a consequence, negative and zero estimates of variance components can be avoided.
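The idea of placing a prior on the main plot variance component and sampling by Markov chain Monte Carlo can be sketched in a few lines. This is not the WinBUGS or SAS code of paper [1]: it is a minimal Gibbs sampler for a balanced one-way random effects model (laboratories as main plots), and the inverse-gamma priors, their hyperparameters, the function names and the data are all assumptions made for illustration.

```python
import random
from statistics import mean


def inv_gamma(rng, shape, scale):
    """Draw from an inverse-gamma(shape, scale) distribution."""
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


def gibbs_variance_components(labs, prior_lab=(2.0, 1.0), prior_err=(2.0, 1.0),
                              draws=2000, burn_in=500, seed=1):
    """Gibbs sampler for y_ij = mu + a_i + e_ij with a flat prior on mu and
    (assumed) inverse-gamma priors on both variance components."""
    rng = random.Random(seed)
    p, n = len(labs), len(labs[0])
    lab_means = [mean(lab) for lab in labs]
    mu, var_lab, var_err = mean(lab_means), 1.0, 1.0
    kept = []
    for iteration in range(burn_in + draws):
        # Laboratory effects: normal, shrunk towards zero by the lab variance.
        precision = n / var_err + 1.0 / var_lab
        effects = [rng.gauss((n / var_err) * (m - mu) / precision, precision ** -0.5)
                   for m in lab_means]
        # Overall mean given the effects (flat prior).
        centre = mean(m - a for m, a in zip(lab_means, effects))
        mu = rng.gauss(centre, (var_err / (p * n)) ** 0.5)
        # Variance components: conjugate inverse-gamma updates.
        var_lab = inv_gamma(rng, prior_lab[0] + p / 2,
                            prior_lab[1] + sum(a * a for a in effects) / 2)
        sse = sum((y - mu - a) ** 2 for lab, a in zip(labs, effects) for y in lab)
        var_err = inv_gamma(rng, prior_err[0] + p * n / 2, prior_err[1] + sse / 2)
        if iteration >= burn_in:
            kept.append((var_lab, var_err))
    return kept


# Made-up round robin data: 4 laboratories, 2 repeat measurements each.
data = [[10.0, 12.0], [11.0, 11.2], [10.5, 11.9], [11.4, 10.3]]
samples = gibbs_variance_components(data)
print("Posterior mean of lab variance:", mean(v for v, _ in samples))
```

Every draw of the laboratory variance is strictly positive, so a summary such as the posterior mean cannot be negative or zero. With data as sparse as these, the posterior for the laboratory variance stays close to the prior, which is why the choice of prior deserves care in practice.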

The design and analysis of multi-stratum experiments has become one of the most vibrant areas of research in the design and analysis of experiments. In addition to introducing the Bayesian analysis, Gilmour and Goos outlined a general strategy for choosing an appropriate model to analyse data from multi-stratum designs in a paper [5] which won the 2012 American Statistical Association Award for Statistics in Chemistry. Meanwhile, Gilmour was PI on Engineering and Physical Sciences Research Council research grant EP/C541715/1, `Unifying approaches to design of experiments' (£435,000) from 2005-2010, one of the themes of which was multi-stratum designs. This led, amongst other publications, to an RSS discussion paper [6], which introduced the idea of designing experiments specifically to ensure good and robust estimation of the required variance components.

**References to the research**

1. S. G. Gilmour and P. Goos (2009) `Analysis of data from nonorthogonal
multi-stratum designs in industrial experiments'. *Journal of the Royal
Statistical Society*, Series C, 58, 467-484.

2. M. Y. Baba and S. G. Gilmour (2007) `Bayesian estimation from
saturated factorial designs'. In *Bayesian Process Monitoring, Control
and Optimization* (eds. B. M. Colosimo and E. del Castillo), Chapman
& Hall/CRC, New York, 2007, pp.311-322.

3. L. A. Trinca and S. G. Gilmour (2001) `Multi-stratum response surface
designs'. *Technometrics*, 43, 25-33.

4. S. G. Gilmour, J. M. Pardo, L. A. Trinca, K. Niranjan and D. S.
Mottram (2000) `A split-unit response surface design for improving aroma
retention in freeze dried coffee'. *Proceedings of the 6th European
Conference on Food-Industry and Statistics*, 18.0-18.9.

5. P. Goos and S. G. Gilmour (2012) `A general strategy for analysing
data from split-plot and multistratum experimental designs'. *Technometrics*,
54, 340-354.

6. S. G. Gilmour and L. A. Trinca (2012) `Optimal design criteria for
statistical inference (with discussion)'. *Applied Statistics*, 61,
345-401.

**Details of the impact**

Gilmour's research has led to changes and improvements in guidelines on statistical procedures issued by the CEC, and has led to Shell Global Solutions incorporating his methodology in their portfolio of statistical techniques used to estimate fuel economy and other benefits from experimental data. His work has refined and improved the technical quality of their processes, with a concomitant financial benefit, although this is difficult to quantify.

Test methods for ascertaining the performance of fuels and lubricants need to be precise and reliable. In order to ensure the former, round robin studies are performed, where identical samples of, for example, a fuel are tested repeatedly within the same and across different laboratories.

Important numerical outcome measures of this process are the repeatability and reproducibility. These precision measures reflect the consistency of repeat measurements made at the same and at different laboratories respectively. The formulae for calculating these measures involve variance components, which are estimated from the data. Since in cost-constrained experimentation the number of times which measurements within labs are replicated is usually small, the methods in the ISO 5725 standard used by industry can lead to negative estimates of the laboratory-to-laboratory variance component. This is unrealistically assumed to be zero in standard computer packages and procedures. This phenomenon has been observed by Shell and at CEC. The approach of Gilmour, which underpins this impact, offers a solution to this problem.
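The negative estimate described above can be reproduced with the usual one-way analysis of variance estimators. The sketch below is a simplified illustration rather than the full ISO 5725 procedure: it assumes a balanced round robin, and the function name and the data are made up for the example.

```python
from statistics import mean


def precision_estimates(labs):
    """ANOVA variance component estimates for a balanced round robin.

    labs: one list of repeat measurements per laboratory, all of equal length.
    """
    p, n = len(labs), len(labs[0])
    lab_means = [mean(lab) for lab in labs]
    grand_mean = mean(lab_means)
    # Within-laboratory mean square: estimates the repeatability variance.
    msw = sum((y - m) ** 2 for lab, m in zip(labs, lab_means) for y in lab)
    msw /= p * (n - 1)
    # Between-laboratory mean square.
    msb = n * sum((m - grand_mean) ** 2 for m in lab_means) / (p - 1)
    # Laboratory-to-laboratory variance component: negative when MSB < MSW.
    var_lab = (msb - msw) / n
    return {
        "var_repeatability": msw,
        "var_lab": var_lab,
        # Common practice: truncate a negative estimate at zero.
        "var_reproducibility": msw + max(var_lab, 0.0),
    }


# Made-up data: 4 laboratories, 2 repeat measurements each.
data = [[10.0, 12.0], [11.0, 11.2], [10.5, 11.9], [11.4, 10.3]]
estimates = precision_estimates(data)
print(estimates)
```

With these data the laboratory means are closer together than the within-laboratory scatter would suggest, so the laboratory-to-laboratory estimate is negative. Truncating it at zero makes the reproducibility variance equal to the repeatability variance, which is the unrealistic outcome the text refers to.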

The methodology has been used in round robin experiments for two test methods (nozzle fouling and low temperature pumpability) by the Senior Consultant Statistician at Shell, and by the Chairman of the CEC Statistical Development Group, who is also Statistical Advisor at Infineum, a company which formulates, manufactures and markets petroleum additives. A presentation on the methodology, including examples of WinBUGS and SAS code, has been made to other members of the CEC Statistical Development Group, and this is stored online in the private CEC-SDG working area.

*Impact on European guidelines*

CEC is the Coordinating European Council, the European Fuels and
Lubricants Performance Test Development Organisation. It represents the
motor, oil and petroleum additive industries and develops test methods to
evaluate the likely performance of different fuels and lubricants in the
field. Their test methods are used in member laboratories throughout
Europe and their performance is monitored by the regular testing of
reference fluids of known performance. Industrial statisticians support
each test method to ensure consistency and accuracy of application.
Normally laboratories would follow the ISO 5725 Accuracy [trueness and
precision] of measurement methods and results standard, as detailed in the
CEC manual. However, CEC began to notice that, in some of their round
robin testing, there was a problem with negative variance components,
particularly in studies constrained by cost where samples are tested just
once or twice in a limited number of laboratories. This was leading to
unreliable estimates of repeatability and reproducibility. Given the
significant impact of unreliable precision data across the industry, CEC
sought to revise their approach. As a result of Shell's Senior Consultant
Statistician presenting to both Shell and CEC on the Gilmour method, our
research [1] now features as an element of the CEC Statistics Manual
Procedures [8] as an alternative to ISO 5725. The Chair of the CEC
Statistical Development Group [7] confirms that "...*the Gilmour method
tends to give precision estimates which look more reasonable since they
have a sensible prior which the estimates shrink towards when there is
lack of information*". He also indicated that the methodology is
likely to be used in further round robin analyses for these two methods
and/or others. It might also prove useful when the number of participating
laboratories and hence the degrees of freedom for reproducibility are
small.

*Redefining statistical protocols for industry*

After attending Gilmour's talk on `Multi-Stratum Response Surface Designs
in Industrial Experiments' at the International Symposium on Business and
Industrial Statistics, Shell Global Solutions first became interested in
his work and its potential application in their fuel technology
development programme. Following the subsequent publication of [1], and
its conclusion that a Bayesian approach culminates in a more appropriate
analysis, members of the Shell Global Solutions statistics team [9, 10]
contacted Gilmour for details about the method as they were keen to employ
it at their Research and Technology Centre at Thornton in Cheshire. Their
Senior Consultant Statistician [9] stated that Gilmour "*seemed to have
a solution to a problem we had identified*".

The Global Solutions team were involved in the testing of different fuels in a number of cars to measure fuel economy. Because of the limited number of experimental runs possible and logistical constraints, Shell's research group cannot always use conventional, fully randomized experimental designs. Some variables are harder to change than others and cannot be altered on a trial-to-trial basis. For example, the car age and engine setup are harder to change than the fuel formulation. The real-life driving conditions encountered during testing cannot be controlled at all. As a result, experimental design and analysis become extremely challenging.

In a `simple' run of experiments, where the variables can be altered from test to test ad lib, all treatment effects can be tested against a single error term in the subsequent analysis of variance. However, in Shell's case, some variables can only be changed from day to day rather than from run to run. Such variables need to be tested against an appropriate combination of the day-to-day and within-day variance components. But as the data available from experiments are often sparse, the estimates of the error terms were unreliable. As a result, the experiments that Shell were running were not as informative as they required.

With the support of a Knowledge Transfer Partnership grant, Gilmour placed his PhD student Lutfor Rahman in the department of the Senior Consultant Statistician at Shell for a three-month placement, during which they developed a step-by-step method for implementing the new analysis so that it could be used in Shell's ongoing programme of fuel technology development. The Gilmour method was used to analyse fuel economy data sets with a day-to-day component of variance that was difficult to estimate because the number of days' testing was generally small. Standard methods often arrived at negative estimates, which the Shell team recognised as unrealistic.

By using the Gilmour method in addition to their own methods during the project, the Shell team gained confidence in the results and, as a consequence, did not recommend further expensive testing. Shell Global Solutions have now added this type of multi-stratum analysis to their repertoire of techniques as a solution to an identified class of problems.

**Sources to corroborate the impact**

7. Chairman of the CEC Statistical Development Group [impact of the research on European guidelines for fuels and lubricants performance testing, in particular on modifications by CEC to the ISO 5725 standard]
8. www.cectests.org/cec-constitution.asp [click on Statistics; Statistics Manual - Procedure 1]
9. Senior Consultant Statistician, Shell Global Solutions, UK [inclusion of techniques based on QMUL research in Shell's portfolio for analysing precision studies and assessing test methods, with improved technical quality of processes and improved financial efficiency]
10. Consultant Statistician, Shell Global Solutions, UK [inclusion of techniques based on QMUL research in Shell's portfolio for analysing precision studies and assessing test methods, with improved technical quality of processes and improved financial efficiency]