Big data are coming to psychiatry: a general introduction

Monteith, Scott; Glenn, Tasha; Geddes, John; Bauer, Michael

doi:10.1186/s40345-015-0038-9

International Journal of Bipolar Disorders

Table 2 Examples of bias errors in EMR and claims data

Study description	Issue	Errors found	Patient source	References
Examine relationship between illness severity and quantity of data in EMR	Data sufficiency	Setting minimal data requirements for inclusion in a study cohort created bias toward selection of sicker patients	EMR records from 10,000 patients who received anesthetic services	Rusanov et al. (2014)
Investigate patterns in lab tests for potential impact on use in modeling EMR data	Context for interpreting lab test results	Frequency of lab tests confounded by scheduled visits, such as every 3 months	EMR records from 14,141 patients	Pivovarov et al. (2014)
Repeat prior study of pneumonia severity index to demonstrate bias in EMR retrospective research	(a) Diagnostic consistency	Adding constraints to improve consistency of diagnostic cohort significantly changed the sample (decreased the size)	EMR records from 46,642 patients with indication of pneumonia	Hripcsak et al. (2011)
	(b) Small number of cases can have large impact on outcome	Very sick patients who die quickly in ER will not have symptoms entered into EMR, impacting mortality rates	EMR records from 46,642 patients with indication of pneumonia	Hripcsak et al. (2011)
Investigate concordance of diagnosis of PTSD in EMR with diagnosis determined by SCID interview	Diagnostic accuracy	Over 25 % of EMR diagnoses in veterans were incorrect for PTSD. Those with least and most severe symptoms most likely to be accurate	Sample of 1649 veterans	Holowka et al. (2014)
Evaluate diagnosis of schizophrenia in EMR compared with chart review by psychiatrist	Diagnostic accuracy	Prevalence of schizophrenia was 14 % by coding, dropping to 1.8 % with manual review. Coding most accurate (74 %) for those with four or more coding labels	819 veterans in a pain clinic	Jasser et al. (2007)
Review whether written informed consent introduces selection bias in prospective observational studies using data from EMR	Written informed consent	Significant differences between participants and non-participants with inconsistent direction of effect	Review of 1650 citations. 17 studies included with 69 % of 161,604 eligible patients giving consent	Kho et al. (2009)
Analyze if underlying health of seniors impacts risk reduction for death and hospitalization associated with influenza vaccine	Selective prescribing of preventative measures	Greatest reduction in risk occurs before influenza season, indicating preferential receipt of vaccine by healthy seniors	72,527 people ≥65 years not residing in nursing homes, using plan administrative data	Jackson et al. (2006)
Investigate surprising protective effects attributed to preventative medications by examining association between statin use and motor vehicle and workplace accidents	Healthy-adherer bias (adherent patients more health seeking)	Statin users significantly less likely to be involved in motor vehicle and workplace accidents. Example of unmeasurable confounding in dataset	141,086 patients taking statins for prevention	Dormuth et al. (2009)
Passive case-finding for Alzheimer’s disease and dementia using medical records	Research center population not generalizable	Research center population younger, more severe disease, more educated than general population	5233 patients over age 70	Knopman et al. (2011)
Explore selection bias when comparing outcomes from cancer therapy using observational data in SEER database	Severity of illness, self-rated health, comorbidities	Improbable results. Adjustment techniques such as propensity scores insufficient. Some outcome measures caused by treatments	53,952 patients with prostate cancer in three therapy groups	Giordano et al. (2008)
