Table 2 Examples of bias errors in EMR and claims data

From: Big data are coming to psychiatry: a general introduction

| Study description | Issue | Errors found | Patient source | References |
| --- | --- | --- | --- | --- |
| Examine relationship between illness severity and quantity of data in EMR | Data sufficiency | Setting minimal data requirements for inclusion in a study cohort created bias toward selection of sicker patients | EMR records from 10,000 patients who received anesthetic services | Rusanov et al. (2014) |
| Investigate patterns in lab tests for potential impact on use in modeling EMR data | Context for interpreting lab tests results | Frequency of lab tests confounded by scheduled visits, such as every 3 months | EMR records from 14,141 patients | Pivovarov et al. (2014) |
| Repeat prior study of pneumonia severity index to demonstrate bias in EMR retrospective research | (a) Diagnostic consistency | Adding constraints to improve consistency of diagnostic cohort significantly changed the sample (decreased the size) | EMR records from 46,642 patients with indication of pneumonia | Hripcsak et al. (2011) |
| | (b) Small number of cases can have large impact on outcome | Very sick patients who die quickly in ER will not have symptoms entered into EMR, impacting mortality rates | | |
| Investigate concordance of diagnosis of PTSD in EMR with diagnosis determined by SCID interview | Diagnostic accuracy | Over 25 % of EMR diagnoses in veterans were incorrect for PTSD. Those with least and most severe symptoms most likely to be accurate | Sample of 1649 veterans | Holowka et al. (2014) |
| Evaluate diagnosis of schizophrenia in EMR compared with chart review by psychiatrist | Diagnostic accuracy | Prevalence of schizophrenia was 14 % by coding, dropping to 1.8 % with manual review. Coding most accurate (74 %) for those with four or more coding labels | 819 veterans in a pain clinic | Jasser et al. (2007) |
| Review whether written informed consent introduces selection bias in prospective observational studies using data from EMR | Written informed consent | Significant differences between participants and non-participants with inconsistent direction of effect | Review of 1650 citations. 17 studies included with 69 % of 161,604 eligible patients giving consent | Kho et al. (2009) |
| Analyze if underlying health of seniors impacts risk reduction for death and hospitalization associated with influenza vaccine | Selective prescribing of preventative measures | Greatest reduction in risk occurs before influenza season, indicating preferential receipt of vaccine by healthy seniors | 72,527 people ≥65 years not residing in nursing homes, using plan administrative data | Jackson et al. (2006) |
| Investigate surprising protective effects attributed to preventative medications by examining association between statin use and motor vehicle and workplace accidents | Healthy-adherer bias (adherent patients more health seeking) | Statin users significantly less likely to be involved in motor vehicle and workplace accidents. Example of unmeasurable confounding in dataset | 141,086 patients taking statins for prevention | Dormuth et al. (2009) |
| Passive case-finding for Alzheimer’s disease and dementia using medical records | Research center population not generalizable | Research center population younger, more severe disease, more educated than general population | 5233 patients over age 70 | Knopman et al. (2011) |
| Explore selection bias when comparing outcomes from cancer therapy using observational data in SEER database | Severity of illness, self-rated health, comorbidities | Improbable results. Adjustment techniques such as propensity scores insufficient. Some outcome measures caused by treatments | 53,952 patients with prostate cancer in three therapy groups | Giordano et al. (2008) |