Methodology and Applications of Building a National File of Health and Mortality data

Authors

  • Leicester Gill, University of Oxford, Department of Public Health and Primary Care, Unit of Health-Care Epidemiology, Oxford

DOI:

https://doi.org/10.17713/ajs.v33i1&2.433

Abstract

National collections of historical administrative and other health data can number hundreds of millions of records, with new data being added at the rate of tens of millions of records each year. Although improvements in computing and storage technology have to some extent kept pace with this accelerating growth in the datasets, there has been little development over the past few decades in the way probabilistic record linkage is undertaken, particularly with respect to the match acceptance thresholds and the clerical review processes needed to decide on doubtful matches.
This paper describes the major features of the Oxford Record Linkage Study (ORLS), and the developments in probabilistic matching methods and the use of intelligent and data mining methodologies to select potential links between pairs of records.
The ORLS linked file was developed using a collection of linkable abstracts that comprise a health region in the United Kingdom. The ORLS file contains 12 million records for 6 million people and spans 39 years. This dataset is used for the preparation of person-linked health services statistics, and for epidemiological and health services research. The policy of the ORLS is to link all the records comprehensively rather than prepare links on an ad hoc basis.
The ORLS has been developing improved techniques for deterministic and probabilistic linkage, together with algorithms for reducing the amount of clerical review, which is time consuming, expensive, and of variable quality. The methodology has been extended and refined for matching and linking other large UK government datasets, in particular the National Health Service Central Register (60+ million records), a number of disease and local authority registers, and, more recently, for the development of a UK National File of Linked Hospital Episode Statistics and Mortality data. This file spans 4 years, currently holds 52 million records, and will increase by 14 million records per annum.
Since the implementation of the Data Protection Act (1998) in the UK, all names and addresses have been stripped from the health files. Matching and linkage are undertaken using the national NHS number and other partial identifiers. The matching methodology described in this paper is for linking such datasets using various combinations of the partial identifiers.
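
As a rough illustration of the ideas named in this abstract (probabilistic matching on partial identifiers, acceptance thresholds, and a clerical review band), the following is a minimal sketch of a generic Fellegi-Sunter-style match weight. It is not the ORLS method described in the paper. The field names, the m and u probabilities, and both thresholds are hypothetical values chosen only for the example.

```python
import math

# Hypothetical m/u probabilities per partial identifier (illustrative only).
# m = P(field agrees | both records belong to the same person)
# u = P(field agrees | records belong to different people)
FIELD_PROBS = {
    "nhs_number": (0.98, 0.0001),
    "date_of_birth": (0.97, 0.003),
    "sex": (0.99, 0.5),
    "postcode": (0.85, 0.01),
}

# Assumed thresholds; in practice these would be tuned on the data.
UPPER_THRESHOLD = 15.0  # at or above: accept the pair as a link
LOWER_THRESHOLD = 0.0   # at or below: reject the pair


def match_weight(rec_a, rec_b):
    """Sum log2 agreement/disagreement weights over fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)  # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


def classify(weight):
    """Map a match weight to link, non-link, or the clerical review band."""
    if weight >= UPPER_THRESHOLD:
        return "link"
    if weight <= LOWER_THRESHOLD:
        return "non-link"
    return "clerical review"


if __name__ == "__main__":
    hospital = {"nhs_number": None, "date_of_birth": "1950-02-11",
                "sex": "F", "postcode": "OX3 7LF"}
    death = {"nhs_number": "9434765919", "date_of_birth": "1950-02-11",
             "sex": "F", "postcode": "OX1 2JD"}
    w = match_weight(hospital, death)
    print(round(w, 2), classify(w))
```

In this sketch, a pair that agrees on every identifier scores well above the upper threshold, while a pair with a missing NHS number and a changed postcode falls between the two thresholds and would be referred for clerical review. Narrowing that middle band is one way to reduce the amount of clerical review.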

Published

2016-04-03

How to Cite

Gill, L. (2016). Methodology and Applications of Building a National File of Health and Mortality data. Austrian Journal of Statistics, 33(1&2), 101–124. https://doi.org/10.17713/ajs.v33i1&2.433