Table 2 Examples of bias errors in EMR and claims data

From: Big data are coming to psychiatry: a general introduction

| Study description | Issue | Errors found | Patient source | References |
| --- | --- | --- | --- | --- |
| Examine relationship between illness severity and quantity of data in EMR | Data sufficiency | Setting minimal data requirements for inclusion in a study cohort created bias toward selection of sicker patients | EMR records from 10,000 patients who received anesthetic services | Rusanov et al. (2014) |
| Investigate patterns in lab tests for potential impact on use in modeling EMR data | Context for interpreting lab test results | Frequency of lab tests confounded by scheduled visits, such as every 3 months | EMR records from 14,141 patients | Pivovarov et al. (2014) |
| Repeat prior study of pneumonia severity index to demonstrate bias in EMR retrospective research | (a) Diagnostic consistency; (b) Small number of cases can have large impact on outcome | (a) Adding constraints to improve consistency of diagnostic cohort significantly changed the sample (decreased the size); (b) Very sick patients who die quickly in ER will not have symptoms entered into EMR, impacting mortality rates | EMR records from 46,642 patients with indication of pneumonia | Hripcsak et al. (2011) |
| Investigate concordance of diagnosis of PTSD in EMR with diagnosis determined by SCID interview | Diagnostic accuracy | Over 25% of EMR diagnoses in veterans were incorrect for PTSD. Those with least and most severe symptoms most likely to be accurate | Sample of 1649 veterans | Holowka et al. (2014) |
| Evaluate diagnosis of schizophrenia in EMR compared with chart review by psychiatrist | Diagnostic accuracy | Prevalence of schizophrenia was 14% by coding, dropping to 1.8% with manual review. Coding most accurate (74%) for those with four or more coding labels | 819 veterans in a pain clinic | Jasser et al. (2007) |
| Review whether written informed consent introduces selection bias in prospective observational studies using data from EMR | Written informed consent | Significant differences between participants and non-participants with inconsistent direction of effect | Review of 1650 citations. 17 studies included with 69% of 161,604 eligible patients giving consent | Kho et al. (2009) |
| Analyze if underlying health of seniors impacts risk reduction for death and hospitalization associated with influenza vaccine | Selective prescribing of preventative measures | Greatest reduction in risk occurs before influenza season, indicating preferential receipt of vaccine by healthy seniors | 72,527 people ≥65 years not residing in nursing homes, using plan administrative data | Jackson et al. (2006) |
| Investigate surprising protective effects attributed to preventative medications by examining association between statin use and motor vehicle and workplace accidents | Healthy-adherer bias (adherent patients more health seeking) | Statin users significantly less likely to be involved in motor vehicle and workplace accidents. Example of unmeasurable confounding in dataset | 141,086 patients taking statins for prevention | Dormuth et al. (2009) |
| Passive case-finding for Alzheimer’s disease and dementia using medical records | Research center population not generalizable | Research center population younger, more severe disease, more educated than general population | 5233 patients over age 70 | Knopman et al. (2011) |
| Explore selection bias when comparing outcomes from cancer therapy using observational data in SEER database | Severity of illness, self-rated health, comorbidities | Improbable results. Adjustment techniques such as propensity scores insufficient. Some outcome measures caused by treatments | 53,952 patients with prostate cancer in three therapy groups | Giordano et al. (2008) |
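
The data-sufficiency issue reported by Rusanov et al. (2014) can be illustrated with a toy calculation. The sketch below is not the authors' method and uses no data from any cited study: the patient list, the 1 to 5 severity scale, the `min_records` threshold, and both helper functions are hypothetical, and it assumes that sicker patients accumulate more EMR data points. Under that assumption, a minimum-data inclusion rule raises the average severity of the selected cohort.

```python
from statistics import mean

# Hypothetical, synthetic patients: (severity score 1-5, number of EMR data points).
# Assumption for illustration only: sicker patients accumulate more records.
patients = [
    (1, 2), (1, 3), (2, 4), (2, 6), (3, 9),
    (3, 12), (4, 20), (4, 25), (5, 40), (5, 55),
]

def mean_severity(cohort):
    """Average severity score of a cohort of (severity, n_records) tuples."""
    return mean(severity for severity, _ in cohort)

def apply_min_data_rule(cohort, min_records):
    """Keep only patients with at least `min_records` EMR data points."""
    return [p for p in cohort if p[1] >= min_records]

full_mean = mean_severity(patients)
selected = apply_min_data_rule(patients, min_records=10)
selected_mean = mean_severity(selected)

print(f"all patients: n={len(patients)}, mean severity={full_mean:.2f}")
print(f"after rule:   n={len(selected)}, mean severity={selected_mean:.2f}")
```

With these made-up numbers the rule keeps 5 of 10 patients and the mean severity rises from 3.00 to 4.20. The size of the shift depends entirely on the assumed link between severity and record count, so the figures say nothing about real cohorts.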