Evaluation of record linkage of two large administrative databases in a middle income country: stillbirths and notifications of dengue during pregnancy in Brazil

BMC Medical Informatics and Decision Making, 17:108

First Online: 17 July 2017, Received: 10 April 2017, Accepted: 10 July 2017


Background

Due to the increasing availability of individual-level information across different electronic datasets, record linkage has become an efficient and important research tool. High quality linkage is essential for producing robust results. The objective of this study was to describe the process of preparing and linking national Brazilian datasets, and to compare the accuracy of different linkage methods for assessing the risk of stillbirth due to dengue in pregnancy.

Methods

We linked mothers and stillbirths in two routinely collected datasets from Brazil for 2009–2010: for dengue in pregnancy, notifications of infectious diseases (SINAN); for stillbirths, mortality (SIM). Since there was no unique identifier, we used probabilistic linkage based on maternal name, age and municipality. We compared two probabilistic approaches, each with two thresholds: (1) a bespoke linkage algorithm; (2) a standard linkage software widely used in Brazil (ReclinkIII), and used manual review to identify further links. Sensitivity and positive predictive value (PPV) were estimated using a subset of gold-standard data created through manual review. We examined the characteristics of false-matches and missed-matches to identify any sources of bias.
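
To make the idea of probabilistic linkage with thresholds concrete, the following is a minimal Fellegi-Sunter-style sketch in Python. It is not the study's bespoke algorithm or ReclinkIII: the agreement rules, the m- and u-probabilities, the two cut-offs and all function names are hypothetical values chosen only for illustration. Each of the three linkage fields named above (maternal name, age, municipality) contributes a log-likelihood-ratio weight, and the summed weight is compared against an upper and a lower threshold.

```python
import math
from difflib import SequenceMatcher

# Hypothetical parameters (illustrative only, not taken from the study):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "name": (0.95, 0.01),
    "age": (0.90, 0.05),
    "municipality": (0.95, 0.02),
}


def field_agrees(field, a, b):
    """Decide whether two field values agree (assumed rules)."""
    if field == "name":
        # Tolerate small spelling differences in the maternal name.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= 0.85
    if field == "age":
        # Allow one year of difference between the two records.
        return abs(a - b) <= 1
    return a == b


def match_weight(rec_a, rec_b):
    """Sum the log2 agreement or disagreement weights over all fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if field_agrees(field, rec_a[field], rec_b[field]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=3.0):
    """Assumed cut-offs: link above `upper`, manual review in between."""
    if weight >= upper:
        return "link"
    if weight >= lower:
        return "review"
    return "non-link"


# Made-up example records.
dengue = {"name": "Maria da Silva Santos", "age": 27, "municipality": "Salvador"}
death_close = {"name": "Maria da Silva Santo", "age": 28, "municipality": "Salvador"}
death_moved = {"name": "Maria da Silva Santos", "age": 27, "municipality": "Recife"}
death_other = {"name": "Ana Pereira", "age": 40, "municipality": "Recife"}

for candidate in (death_close, death_moved, death_other):
    print(classify(match_weight(dengue, candidate)))
```

In this sketch, raising `upper` corresponds to a more conservative threshold (fewer false-matches, more missed-matches), while pairs that fall between the two cut-offs are the uncertain links that would go to manual review.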

Results

From records of 678,999 dengue cases and 62,373 stillbirths, the gold-standard linkage identified 191 cases. The bespoke linkage algorithm with a conservative threshold produced 131 links, with sensitivity = 64.4% (68 missed-matches) and PPV = 92.5% (8 false-matches). Manual review of uncertain links identified an additional 37 links, increasing sensitivity to 83.7%. The bespoke algorithm with a relaxed threshold identified 132 true matches (sensitivity = 69.1%), but introduced 61 false-matches (PPV = 68.4%). ReclinkIII produced lower sensitivity and PPV than the bespoke linkage algorithm. Linkage error was not associated with any recorded study variables.
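
As a worked check of how these two measures are defined, the short Python sketch below recomputes them from counts reported in this abstract: the 191 gold-standard links, the 68 missed-matches at the conservative threshold, and the 132 true matches and 61 false-matches at the relaxed threshold. The function names are hypothetical helpers, not part of any linkage software.

```python
def sensitivity(true_matches, gold_total):
    """Share of gold-standard links that the method recovered."""
    return true_matches / gold_total


def ppv(true_matches, false_matches):
    """Share of the links produced that are true matches."""
    return true_matches / (true_matches + false_matches)


GOLD_LINKS = 191  # links in the gold-standard data

# Conservative threshold: 68 gold-standard links were missed.
sens_conservative = sensitivity(GOLD_LINKS - 68, GOLD_LINKS)

# Relaxed threshold: 132 true matches and 61 false-matches.
sens_relaxed = sensitivity(132, GOLD_LINKS)
ppv_relaxed = ppv(132, 61)

print(f"{sens_conservative:.1%}")  # 64.4%
print(f"{sens_relaxed:.1%}")       # 69.1%
print(f"{ppv_relaxed:.1%}")        # 68.4%
```

The trade-off is visible in the counts: relaxing the threshold recovers a few more true matches but adds many more false-matches, so PPV falls much faster than sensitivity rises.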

Conclusion

Despite a lack of unique identifiers for linking mothers and stillbirths, we demonstrate a high standard of linkage of large routine databases from a middle income country. Probabilistic linkage and manual review were essential for accurately identifying cases for a case-control study, but this approach may not be feasible for larger databases or for linkage of more common outcomes.

Keywords

Data linkage, Routine data, Electronic health records, Linkage quality, Linkage accuracy, Stillbirth, Dengue

Abbreviations

PPV: Positive predictive value

ReclinkIII: Software to link data used in Brazil

SIM: Mortality information system

SINAN: Notifiable diseases information system

Author: Enny S Paixão - Katie Harron - Kleydson Andrade - Maria Glória Teixeira - Rosemeire L. Fiaccone - Maria da Conceição

Source: https://link.springer.com/
