SUBMISSION TO

THE LESSONS LEARNED INQUIRY

 INTO FOOT AND MOUTH

 

 

 
 
 
 
 
 
 
 
By

 

Valerie Lusmore

The White House

Orchard Rise

Pwllmeyric

Monmouthshire

NP16 6JT

 

Tel: 01291 623573

Email: val_lusmore@hotmail.com

 

 

 

AUTHOR

 

I have degrees in Mathematics and Applied Mathematics from the University of Natal and I have worked in the computer industry since 1969.  My early work was as a consultant to large international companies, very much at the cutting edge of technology and I was involved in computer modelling on many occasions. In the past 10 years I have evolved into a data specialist.

 

This role developed as I realised that many of the problems in complex computer systems lie in the data rather than the system. The problems are exacerbated by databases containing very large numbers of records, some of which are not consistent with the original data specifications. This leads to anomalies in calculations and processes that do not work correctly.

 

During the 2001 FMD outbreak I was a founder member of the National Foot and Mouth Group. When people kept telling me that they could not understand the data that MAFF/DEFRA published daily on their FMD website, I made a daily analysis and published regular summaries for those who had an interest. I kept as complete a record as I could so that other people could use my data as a resource to get accurate and meaningful figures.

 

 

 


 

INDEX

1. Introduction

2. Executive summary

3. Mathematical Models
            3.1 Pirbright scientists and modelling FMD
            3.2 The mathematical modelling teams
            3.3 The computer modelling tools
            3.4 Papers written by the modelling teams
            3.5 Main problem with modelling - the data
            3.6 Modelling backwards

4. Information Collection
            4.1 Initial data setup
            4.2 Data integrity
            4.3 Summarising the Data
            4.4 Example of Inconsistency and Inaccuracy in MAFF statistics
            4.5 Collection History for IP data
            4.6 County Tables for Slaughter Statistics

5. Examples of Data Problems
            5.1 Analysis of Statistics
            5.2 Name and Address anomalies
            5.3 Data validation
            5.4 Constituency Tables - why do they exist?
            5.5 Changing the Tables
            5.6 Analysis of the new County List by Local Authority

6. Other Methodology
            6.1 Other modelling methods by volunteers
            6.2 Maps
            6.3 Tracings
            6.4 Picture of Epidemic spread using more traditional methods
            6.5 Results of Picture
            6.6 Local Spread?

7. IT systems and resources
            7.1 How could the data have been improved?
            7.2 Organisation and communication
            7.3 Contingency planning

8. Conclusions

9. Appendices
            9.1 Bibliography
            9.2 County and Local Authority Lists - a comparison
            9.3 Example of 'pictorial' weekly reports drawn from DEFRA statistics

 

 


1. Introduction

 

The policies used during the FMD crisis of 2001 were mainly driven by epidemiologists and bio-mathematicians.  These policies were brought in hurriedly at the end of the fifth week into the crisis when there were concerns that the disease was already out of hand.  The main policy that the modellers developed was that of the contiguous cull.

 

Along with many other people with an interest in the subject, I tried to understand the factors behind this model. I found it difficult to see why this particular approach had been used, especially as there was much information available on the Internet about other methods of controlling the disease that were used in other parts of the world.

 

Having trained as a mathematician and scientist originally I was extremely concerned that 'mathematicians and scientists' appeared to be making critical decisions as to the policy to be used to control an animal disease epidemic. Their methods affected the life of the whole rural community and their decisions and methodology were reflecting badly on both mathematics and science.

 

In particular, when I began to look at the available data about the UK outbreak and where it was situated, it very quickly became apparent to me that the information was incomplete, inaccurate, inconsistent and difficult to use.  This led me to develop consistent ways of reporting the MAFF/DEFRA information so that it was clear and simple to understand.

 

 

2. Executive summary

 

This paper covers the mathematical models - and what I consider important about them.

This includes brief descriptions of the groups of people involved, their experience of the subject and the computer modelling tools available.  There are a few comments on the scientific papers produced by the different groups justifying their work and a section on the main problems with the modelling itself.

 

I then move on to the major importance of the information needed to drive the models, how it was set up, examples of what information was available; discussion of the management and control needed for such information; and a commentary on the integrity and accuracy of the data itself.

 

The section on the data problems covers simple examples which demonstrate how I came to believe that the quality of the data was such that it was impossible for the modellers to give an accurate picture of what was happening.

 

The next section then asks whether other methods could and should have been used. It describes some traditional techniques I used with the information available to the modellers at the time and the pictures that I created of how the epidemic was spreading.  This leads into a section on IT systems and how they were not up to modern standards.  IT should have been used far more effectively to free up resources and to aid communications.

 

My conclusions follow - the main one being that there was pressure from political sources to come up with a quick solution.  The data available for the mathematical modelling was such that it would have been better to use other more traditional methodology right at the beginning to get control of what was happening to the disease.  Modelling should have been used as a simple adjunct to other methods - not to shape the policies.


 

3. Mathematical Models

 

3.1 Pirbright scientists and modelling FMD

 

The scientists at Pirbright, the world reference laboratory for FMD, had been working with mathematical models to study various aspects of foot and mouth disease since the 1970s. The EUFMD research group discussed research papers on modelling at their annual conferences in both 1999 and 2000 as well as in earlier years. The group was led by Alex Donaldson and Paul Kitching, who were world class specialists. Both had worked on the EUFMD research group for a number of years and knew most of the other international specialists on this subject.

 

The EUFMD group had a great deal of knowledge of the pan-Asian O strain of FMD and there had been increasingly serious discussions for at least two years of how to cope with this strain when it arrived in Europe.  There had also been discussions of how to cope with the logistics of slaughter and disposal of large numbers of carcasses as it was felt that increasingly this would lead to a public outcry in many countries.

 

In his 1999 paper Alex Donaldson described the earliest model. Through collaboration between IAH, Pirbright, and the UK Meteorological Office, a computer-based model was developed during the 1970s for assessing the risk of airborne spread of FMD. It was created by bringing together data on the aerobiology of FMD with data on the physical behaviour of particles in the atmosphere under different climatic conditions. The model was shown to be capable of giving a prediction, within a few hours of the confirmation of an outbreak of FMD, of whether there was a risk of spread and, if so, which farms were in jeopardy. The model could predict accurately up to a distance of 10 km from the source. Any farms considered to be at risk could be placed under intensive surveillance so that suspected cases could be quickly identified and eliminated. The model was used successfully under operational conditions during the outbreaks of FMD on Jersey and the Isle of Wight in March 1981.

 

 

3.2 The mathematical modelling teams

 

MAFF invited 3 teams of mathematical modellers to assist with analysis and prediction of the outbreak - a fourth team from Imperial College independently created their own model.

 

I will refer to them as:

The VLA team - Professor Wilesmith from the Veterinary Laboratories Agency was backed up by colleagues from Massey University in New Zealand - they had worked together previously on the BSE problem.

 

The Cambridge team - Professor Grenfell from Cambridge and his colleagues - they were very experienced at analysing data from an epidemic against various factors to see if fresh insights could be obtained (eg measles, Soay sheep, etc)

 

The Edinburgh team - Professor Woolhouse and colleagues - had the most expertise in Geographic Information Systems and were called upon by the other groups in this area.

 

The Imperial team - Professor Anderson and his colleagues from Imperial College were well known to the Chief Scientist but were not among the groups originally asked to take part. They had experience of BSE modelling and various (mostly human) epidemic diseases. The team had recently moved (Nov 2000) from Oxford to a new department researching Human Health headed by Professor Anderson at Imperial College.  Their only experience of FMD was that Ms Donnelly had co-authored a paper in July 2000 in which the data from the FMD epidemics in the UK in 1967 and Taiwan in 1997 were run through epidemic simulations. The conclusion of that simulation exercise was that it was imperative that herds be slaughtered on the day that disease was confirmed and that resources should be available to implement this policy should an outbreak occur.

 

 

 

3.3 The computer modelling tools

 

Epiman database - developed at Massey University in the early 1990s and used to track and manage outbreaks.  It had been adapted for EU conditions and tested by various European groups of FMD researchers.  MAFF purchased it some years before the 2001 UK outbreak but did not set it up with data.  It needs time to be set up.

Quote from a Dutch team of researchers in the mid-1990s  'The EpiMAN(EU) GIS application was user friendly and provided the user with good tools to facilitate certain tasks in the control of a FMD outbreak. The system could be used in the Netherlands, and has potential for other countries as well. However, digitised data has to be available in advance of an outbreak, which is not completely the case yet in the Netherlands. To fully use the possibilities of a DSS such as EpiMAN(EU), a permanent, updated database with farm full information, including farm locations, is necessary.' 

This database provided the information for the associated Interspread model which was used for predictions and modelling.

 

Cambridge model - more complex model - contains more detail, in terms of describing transmission between individual farms, as a random process, allowing for more heterogeneity: differences between farms, in terms of numbers of animals, different species.

 

Imperial model - adapted by Neil Ferguson and Christl Donnelly from calculations that used the transmission of human sexually transmitted diseases to model spread together with knowledge gleaned from their work on BSE.  Simple model - generalised animal species, and constant infectivity assumed. 

 

Rimpuff Model together with GIS system - developed by University of Denmark in 1990s and adapted for use to predict plumes of virus and spread where predicted weather factors are included.

 

 

3.4 Papers written by the modelling teams

 

There were a number of scientific papers written by the teams of modellers whose work was used by the government to determine their policies.

 

The paper published by the Imperial team in May describes the 'model' they used, which led government policy to adopt the contiguous cull of all livestock within 3 km of an Infected Premises (IP).

 

This model relies heavily on the 'contact tracing' carried out by MAFF in the first 3 weeks. From this information the team concluded that farms within 3 km of an IP appeared to be at greater risk of infection. There was no differentiation between species, although information about the different infectivity of sheep, cattle and pigs for this particular strain of FMD was readily available.  A further assumption was that infectivity on a farm is constant from day 3 to day 11 after infection.

 

Presumably the 3km spread was assumed to be via windborne transmission, although the Pirbright team knew that this strain of virus did not spread that way over more than 200 metres.

 

In the subsequent paper published by this team in October, significant bias in the contact tracing was uncovered and it was suggested that 'local' spread may have been via personnel or vehicles. Later analysis of what had happened also showed that there were significant differences in infectivity between different sizes of farm and types of animal.

 

In a paper by the Edinburgh and Cambridge teams analysing the epidemic data afterwards, it was again discussed that there was a bias in the contact tracing data towards 'local' contacts without discovering how the disease was spread.  They also suggest that in reality there are epidemic dynamics within a farm which means that infectivity changes over time.  These became more significant as delays in culling infected animals (as well as all the contiguous stock) built up.  Analysis found significant differences in infectivity between species with cattle being more liable to infection and sheep being relatively little affected.

 

All these facts were known by the Pirbright team before this epidemic occurred (published in May 2001) - but were not fully taken into account by the modellers.

 

 

3.5 Main problem with modelling - the data

 

The main problem with all the models was not the methodology or the assumptions, but the quality and integrity of the available data.  This was not of a standard consistent with modern practice - there are comments in most of the papers by the modelling teams about the data. These range from the relatively mild comments from the Cambridge team about 'lacunae' in the data, to Anderson of Imperial's comments to the Parliamentary agriculture committee that several of the farms were, according to MAFF's figures, situated in the North Sea.

 

From my own experience of the data, checked and gathered every single day since I became involved in this epidemic, I know that the data which MAFF/DEFRA published is extremely inaccurate. Any experienced data analyst would have realised that there was no point in continuing with the modelling unless significant effort was put into validating, verifying and correcting the available data.  Until the data validity was improved only extremely simplistic models could be run.

 

 

 


 

3.6   Modelling backwards 

 

When running simulations using a computer, there is very often a situation where the people requesting the results know what their desired 'answer' is, and they run several scenarios, making adjustments to the input parameters until the 'right' answer appears.  This is a perfectly valid methodology in certain situations, e.g. when modelling for financial decision making where the maximum budgeted spend is already known.

 

Under these circumstances, an experienced modeller may recognise that the question is no longer 'what happens to the totals if we use x, y and z as values ?'  BUT  'which values of x, y and z will give the acceptable answer for the totals?'  and adjust the basic equations (depending on complexity) to solve for the desired values of the parameters x, y and z. 
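The difference between the two questions can be illustrated with a minimal sketch. The forward model below is a purely hypothetical toy (the function total_cases, its parameters and all its numbers are assumptions made for illustration, not any of the FMD models). It shows only that, once the acceptable total is fixed, a parameter can be solved for directly rather than found by repeated trial runs.

```python
def total_cases(cull_rate, initial_cases=100.0, growth=1.5, weeks=10):
    """Hypothetical toy forward model: weekly cases grow by `growth`,
    reduced by the fraction `cull_rate` removed each week."""
    cases = initial_cases
    total = 0.0
    for _ in range(weeks):
        total += cases
        cases *= growth * (1.0 - cull_rate)
    return total


def solve_for_cull_rate(target_total, lo=0.0, hi=1.0, tol=1e-9):
    """'Backward' question: which cull_rate gives the target total?
    Bisection works here because total_cases falls as cull_rate rises."""
    if not (total_cases(hi) <= target_total <= total_cases(lo)):
        raise ValueError("target is outside the range the model can reach")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if total_cases(mid) > target_total:
            lo = mid  # too many cases: a higher cull rate is needed
        else:
            hi = mid
    return (lo + hi) / 2.0


# Forward: "what happens to the total if we use this value?"
forward_answer = total_cases(0.4)
# Backward: "which value gives the acceptable total?"
required_rate = solve_for_cull_rate(500.0)
```

The solved value is only as meaningful as the forward model and the data behind it; the technique itself says nothing about whether the target was a sensible one.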

 

This can be feasible - and saves wasting a lot of time running simulations.  The 'models' used for studying the FMD spread might reasonably have changed from the question 'where is this spreading to, and how long will it take to come under control?' to 'what do we need to do to get this over by a certain date?'.

 

After all there were various scenarios, such as 'kill every susceptible animal immediately' which would have achieved the desired result within a very short period, always supposing that infinite resources were available. 

 

The problem in this situation is knowing whether the people building the model are sufficiently experienced to recognise this is happening and thus have sufficient understanding of what is really required, as opposed to what they have been asked for. 

 

The people involved in actually doing the modelling may have been relatively inexperienced in such a 'political' environment where the questions asked are not straightforward.  Too many of the people on the science committee were the senior academics and researchers who spend most of their time organising funding for their academic department rather than actually doing any academic work.  They are more used to a political environment.

 

Whether the 'right' questions were asked is unknown, but purely from reading the press releases on the subject of modelling I got a distinct impression that 'backward' modelling techniques may well have been used.  The values discovered by these 'back' techniques would then have been run through the simulation models to give the required answers.  This would no doubt account for very similar results being returned from all the different models.

 

 

Another example which concerns me about the questions asked was that the scenarios used for the modelling of possible vaccination strategies were not realistic and could not have been practically implemented. Clearly the wrong questions were asked.

 

'Local' as a reason for the spread of FMD is used without any explanation or justification as if just being within a certain distance of an IP made a farm more likely to develop the disease.

Again the question of why 'local' spread occurred was not asked in a sufficiently rigorous manner to elicit the answers.

 

Some of the later scientific papers appear to have been written more to justify their earlier decisions and to defend the 'contiguous cull'. Further simulated results are used to justify what 'might have happened' without necessarily using better or more correct information.  The papers appear to be more to ensure further funding of research than to genuinely find out what happened.
           

4. Information Collection

           

4.1 Initial data setup

 

The information required for the models is in two sections:

-        Firstly, the general information giving the overall picture of the locale, both the farming and the geography of the areas

-        Secondly, the particular details of each IP in the FMD outbreak, by location and type of livestock involved

 

Ideally the general information should already have been available and fed into a database (presumably that was why MAFF had purchased the Epiman database two years previously). This would be useful and appropriate under contingency procedures for ANY outbreak of disease.

 

Once the foot and mouth disease outbreak was identified in Britain, four New Zealand experts were sent immediately to work with British colleagues to get the system up and running urgently. Professor Morris of Massey University said that loading all the data and getting the program running was done in four days, by cutting corners to get available data into the system as quickly as possible, "warts and all", rather than methodically and calmly as intended.

 

In this case, the data had to be hurriedly gathered and converted from many different sources - not all of which were compatible.  There was very little time for checking the accuracy or consistency of the available data on the 144,000 holdings in the UK.  Data was collected from:

            June 2000 agricultural census

            Local MAFF office databases (Vetnet)

            Database created for Swine fever outbreaks

 

From these sources a database of all farms was created to be used in conjunction with the data collected from the Infected Premises. Basic data from this database was then immediately available to be passed to the Epiman database of information about the IPs and used in conjunction with the Interspread model.

 

 

4.2 Data integrity

 

Unfortunately this data was flawed from the start:

The June census figures are not consistently collected, and the numbers of animals on holdings change considerably during the year, especially in February, when the number of sheep on each holding needs to be maximised for the sheep premium.

Overall geographical mapping data loaded from current sources did not necessarily match the Vetnet data collected from a number of sources over a period of years. This data was often seriously out of date with regard to crucial factors such as map coordinates, addresses, postcodes, local authorities and counties.

 

Holding data had not been removed for farms that had stopped farming many years before. Holding numbers were not unique.  I was told that there were no computer systems available in local MAFF offices at the start and inexperienced personnel were drafted in to copy all this data in from the Vetnet system, and that inadequate procedures were followed for checking the data for accuracy.
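Checks of this kind are simple to automate. The sketch below assumes a hypothetical record layout (the field names holding_no, easting and northing, and all the records, are invented for illustration). It tests that holding numbers are unique and that map coordinates fall inside the extent of the Ordnance Survey National Grid (0 to 700,000 m east, 0 to 1,300,000 m north). This is a crude range check only: a point in the sea that still lies within the grid would need a proper test against a coastline.

```python
from collections import Counter

# Hypothetical holding records; field names and values are invented for illustration.
holdings = [
    {"holding_no": "A001", "easting": 352000, "northing": 193000},
    {"holding_no": "A002", "easting": 310500, "northing": 520300},
    {"holding_no": "A001", "easting": 298000, "northing": 88000},   # duplicate number
    {"holding_no": "A003", "easting": 912000, "northing": 400000},  # off the grid
    {"holding_no": "A004", "easting": None, "northing": 250000},    # missing coordinate
]

# Extent of the OS National Grid in metres (a crude plausibility range only).
MAX_EASTING = 700_000
MAX_NORTHING = 1_300_000


def duplicate_holding_numbers(records):
    """Return the holding numbers that appear on more than one record."""
    counts = Counter(r["holding_no"] for r in records)
    return sorted(no for no, n in counts.items() if n > 1)


def implausible_coordinates(records):
    """Return holding numbers whose coordinates are missing or outside the grid."""
    bad = []
    for r in records:
        e, n = r["easting"], r["northing"]
        if e is None or n is None:
            bad.append(r["holding_no"])
        elif not (0 <= e <= MAX_EASTING and 0 <= n <= MAX_NORTHING):
            bad.append(r["holding_no"])
    return bad


duplicates = duplicate_holding_numbers(holdings)
bad_coords = implausible_coordinates(holdings)
```

Records flagged in this way would still need to be corrected by someone with local knowledge; the check only shows where to look.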

 

All of these items were crucial for accurate modelling.  The data for the infected premises was not sufficiently accurate or consistent to model clearly - nor was it particularly accurate for the MAFF officials who needed to go round the local area to put on the D notices and movement restrictions.  This was certainly observed at almost every stage throughout the epidemic and must have been even worse at the beginning.

 

Data on the outbreak was gathered by MAFF but, as they were not ready to deal with the outbreak, the accuracy and consistency of that data were also poor.  This became very apparent as soon as data was published on the MAFF website daily for public viewing.

 

This data was widely published in the newspapers and on the BBC.

 

Almost as soon as the data was published the inconsistencies began to show.

 

 

4.3 Summarising the Data

 

One of the main decisions to be made when setting up data of this type is the choice of levels to which the data may be aggregated for various purposes. There may be several different data fields that are stored merely so that data can be aggregated - for example, counties, type of holding, which office they report to, etc.  The decision as to exactly which lists are to be used must be taken when setting up the data, otherwise all data previously loaded has to be changed whenever a decision is made to use a different categorisation.

 

In the case of the county lists on which the data was based this was never sorted out and even today some of the data is aggregated to one set of 'counties' (local authorities) and other information is aggregated to the 'counties' as they were in 1979.  This may seem in itself a small point - but it ensures that it is impossible to completely reconcile the information in the two sets of tables! 
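A minimal sketch of why this matters is given below. The IP numbers and their assignments are invented for illustration; the only assumption is that each record carries one label from each of the two 'county' lists. The subtotals under the two lists cannot be compared row by row, although each should still account for every record exactly once.

```python
from collections import Counter

# Hypothetical IP records, each tagged under two different 'county' lists.
# The IP numbers and assignments are invented for illustration.
ips = [
    {"ip": 1, "county_old": "Gwent", "local_authority": "Monmouthshire"},
    {"ip": 2, "county_old": "Gwent", "local_authority": "Newport"},
    {"ip": 3, "county_old": "Gwent", "local_authority": "Monmouthshire"},
    {"ip": 4, "county_old": "Powys", "local_authority": "Powys"},
]


def subtotals(records, field):
    """Count records per category of the chosen aggregation field."""
    return Counter(r[field] for r in records)


by_old_county = subtotals(ips, "county_old")
by_local_authority = subtotals(ips, "local_authority")

# Category names that appear in one table but not the other:
# these rows cannot be reconciled between the two tables.
unmatched = set(by_old_county) ^ set(by_local_authority)

# Both tables must nevertheless account for every record exactly once.
same_grand_total = (
    sum(by_old_county.values()) == sum(by_local_authority.values()) == len(ips)
)
```

If records are later moved from one category to another in only one of the lists, even the grand totals can drift apart, which is the situation described above.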

 

This has been true throughout the running of the epidemic - the 'local authority' table used to tell the general public where the Infected Premises were located, has been amended constantly since last year, and has only just (Jan 18th 2002) been corrected. 

 

Because the data has been changed so often, the total number of IPs (Infected premises) assigned to a particular county on the Totals page is often different from the number of Names and addresses on the actual Table; the total number of all these sub-totals is often different from the actual number of IPs at any point in time.

 

When producing large amounts of data in a computer it is common practice to produce 'control totals' which ensure that all the data is actually entered into the system. At any point, adding up details of the data gives a check as to whether all the data is included. This principle has been totally ignored throughout the course of this epidemic - the total of the number of cases on the county summary tables often did not add up to the number of cases in the database.
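A control total check of this kind can be sketched in a few lines. All county figures and records below are invented for illustration, and the function name control_total_report is my own; the point is only that the published summary and the detail records are compared automatically every time the data is issued.

```python
# Hypothetical figures, invented for illustration.
published_county_totals = {"Cumbria": 3, "Devon": 2, "Powys": 1}
ip_records = [
    {"ip": 1, "county": "Cumbria"},
    {"ip": 2, "county": "Cumbria"},
    {"ip": 3, "county": "Devon"},
    {"ip": 4, "county": "Devon"},
    {"ip": 5, "county": "Powys"},
    {"ip": 6, "county": "Cumbria"},
    {"ip": 7, "county": "Devon"},  # present in the detail, missing from the summary
]


def control_total_report(summary, records):
    """Compare published subtotals with counts taken from the detail records."""
    counted = {}
    for r in records:
        counted[r["county"]] = counted.get(r["county"], 0) + 1
    discrepancies = {
        county: (summary.get(county, 0), counted.get(county, 0))
        for county in sorted(set(summary) | set(counted))
        if summary.get(county, 0) != counted.get(county, 0)
    }
    return {
        "summary_total": sum(summary.values()),
        "record_total": len(records),
        "discrepancies": discrepancies,  # county -> (published, counted)
    }


report = control_total_report(published_county_totals, ip_records)
```

In this invented example the summary adds up to 6 while there are 7 records, and the report points to the county where the difference lies.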

 


4.4 Example of Inconsistency and Inaccuracy in MAFF statistics

 

A telling example is the slaughter statistics - three different sets of numbers are published on the DEFRA website every day which give the following details:

 

List 1: the number to date of animals slaughtered rounded to thousands - categorised by cattle, sheep, pigs, goats, deer and 'other' - this is over all premises whatever the status - (infected, direct contact, contiguous and 'slaughtered on suspicion')

 

List 2: a complete table by counties of the actual animals slaughtered with one row each for cattle, sheep, pigs, goats, deer and 'other' - and one column each for the 4 different kinds of premises

 

List 3: a summary over all counties of the data on List 2

 

Logically these 3 lists should match each other but they do not. To demonstrate this here is an example from a random day taken recently.


 

Examples below:  (taken from DEFRA as at 14/1/2002 published on 15/1/2002)

 

List 1: from DEFRA as at 17:00 on 14/1/2002 published on 15/1/2002

 

4,050,000 animals recorded as slaughtered (594,000 cattle, 3,310,000 sheep, 142,000 pigs, 2,000 goats, 1,000 deer, 1,000 other animals)

 

List 2:  base slaughter data by county TOTALLED over all counties - from DEFRA as at 17:00 on 14/1/2002 published on 15/1/2002

 

Animals        IPs        DC         Non-contiguous   SOS       Total
Cattle Total   304934     193267     82151            14356     594708
Sheep Total    953989     978262     1270834          110902    3313987
Pigs Total     20204      48944      70714            2543      142405
Goats Total    934        664        544              293       2435
Deer Total     25         578        411              3         1017
Other Total    283        306        0                3         592
Grand Total    1280369    1222021    1424654          128100    4055144
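The comparison between List 1 and List 2 can be made mechanically. The sketch below uses only the figures quoted above; the one assumption is that 'rounded to thousands' means rounding to the nearest thousand, which DEFRA did not state.

```python
# Figures copied from List 1 and List 2 above (DEFRA, as at 17:00 on 14/1/2002).
list1_rounded = {
    "cattle": 594_000, "sheep": 3_310_000, "pigs": 142_000,
    "goats": 2_000, "deer": 1_000, "other": 1_000,
}
list1_headline = 4_050_000

# List 2 rows: (IPs, DC, Non-contiguous, SOS, published row total)
list2 = {
    "cattle": (304934, 193267, 82151, 14356, 594708),
    "sheep": (953989, 978262, 1270834, 110902, 3313987),
    "pigs": (20204, 48944, 70714, 2543, 142405),
    "goats": (934, 664, 544, 293, 2435),
    "deer": (25, 578, 411, 3, 1017),
    "other": (283, 306, 0, 3, 592),
}

# Check 1: does each List 2 row add up to its own published total?
row_errors = {
    species: sum(row[:4]) - row[4]
    for species, row in list2.items()
    if sum(row[:4]) != row[4]
}

# Check 2: does List 2, rounded to the nearest thousand (an assumption),
# reproduce the List 1 figures?  Values are (List 1, rounded List 2).
rounding_mismatches = {
    species: (list1_rounded[species], round(row[4], -3))
    for species, row in list2.items()
    if round(row[4], -3) != list1_rounded[species]
}

# Check 3: how far is the List 2 grand total from the List 1 headline?
list2_grand_total = sum(row[4] for row in list2.values())
headline_gap = list2_grand_total - list1_headline
```

On these figures List 2 is internally consistent, but under the rounding assumption the cattle and sheep figures do not reproduce List 1 (595,000 against 594,000 and 3,314,000 against 3,310,000), and the List 2 grand total exceeds the List 1 headline by 5,144 animals.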

 

List 3 - from DEFRA summary of data over all counties as at 17:00 on 14/1/2002 published on 15/1/2002

 

Total animals slaughtered

Infected
premises

DC Contiguous
premises

DC Non contiguous
premises

Slaughter on suspicion

Grand Total

Cattle

304934

193267

82151

14356

594708

Sheep

953989

979505

1270834

110902

3315230

Pigs

20204

51594

70714

2543

145055

Goats

934

666

544

293

2437