Addressing ascertainment bias in the study of cardiovascular disease burden in opioid use disorders

Dec 3, 2021 to Dec 3, 2021

Addressing ascertainment bias in the study of cardiovascular disease burden in opioid use disorders - application of natural language processing of electronic health records


Description In the United States, the prevalence of long-term exposure to opioid drugs, for both medically and nonmedically indicated purposes, has increased considerably since the mid-1990s. Concerns have emerged about the potential health effects of opioid use, and there is growing interest in possible connections between opioid use and other conditions, including cardiovascular disease (CVD). Electronic health records (EHR) contain information about patient care in the form of structured codes and unstructured notes. Natural language processing (NLP) provides a tool for processing unstructured textual data in EHR clinical notes and extracting useful information in structured formats for research. The purposes of this dissertation were 1) to summarize peer-reviewed literature on the association between nonacute opioid use and CVD; 2) to use NLP methods to estimate the extent of opioid use disorder (OUD) among hospital inpatients that cannot be identified using International Classification of Diseases (ICD) codes; and 3) to determine the extent to which estimates of the association between OUD and CVD may be biased by misclassification of OUD cases that are not identifiable using ICD codes.


First, we conducted a scoping review of the epidemiological literature on nonacute opioid use and CVD to summarize the current evidence about their association and to identify open questions on this topic. Then, we developed a natural language processing algorithm to identify cases of OUD in electronic patient records that were not assigned an ICD-10-CM code for OUD by medical records coders, but for which strong evidence of OUD exists in the unstructured clinical notes. Last, we estimated the association between OUD and six types of CVD: arrhythmia, myocardial infarction, stroke, heart failure, ischemic heart disease, and infective endocarditis. We classified OUD in two ways: using ICD codes alone, and using a combination of ICD codes and cases of OUD identified using NLP. Comparing the two sets of estimates allowed us to assess the effect of misclassifying OUD status when ICD codes are used alone.
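The comparison described above can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm: the phrase patterns, the negation rule, the field names (`icd_oud`, `note`, `cvd`, `oud_combined`), and all records are hypothetical and synthetic, chosen only to show how an odds ratio can shift when OUD cases found in notes are added to those found by ICD codes.

```python
import re

# Hypothetical phrase patterns; the actual NLP rules are not given in the source.
OUD_PATTERN = re.compile(
    r"\b(opioid use disorder|opioid dependence|heroin use)\b", re.IGNORECASE
)
# Hypothetical negation cues, applied to text in the same sentence before a match.
NEGATION = re.compile(r"\b(no|denies|negative for|without)\b[^.]*$", re.IGNORECASE)


def note_suggests_oud(note):
    """Return True if the note has at least one non-negated OUD phrase."""
    for match in OUD_PATTERN.finditer(note):
        preceding = note[max(0, match.start() - 40):match.start()]
        if not NEGATION.search(preceding):
            return True
    return False


def odds_ratio(records, exposure_key, outcome_key="cvd"):
    """Unadjusted odds ratio from a 2x2 table of exposure by outcome."""
    a = b = c = d = 0
    for r in records:
        if r[exposure_key] and r[outcome_key]:
            a += 1  # exposed, with outcome
        elif r[exposure_key]:
            b += 1  # exposed, without outcome
        elif r[outcome_key]:
            c += 1  # unexposed, with outcome
        else:
            d += 1  # unexposed, without outcome
    return (a * d) / (b * c)


def make(n, icd_oud, note, cvd):
    """Build n identical synthetic inpatient records."""
    return [{"icd_oud": icd_oud, "note": note, "cvd": cvd} for _ in range(n)]


# Synthetic data only: counts are invented for illustration.
records = (
    make(4, True, "Opioid use disorder, on buprenorphine.", True)
    + make(6, True, "Opioid use disorder, on buprenorphine.", False)
    + make(3, False, "Long-standing heroin use noted.", True)   # missed by ICD
    + make(2, False, "Long-standing heroin use noted.", False)  # missed by ICD
    + make(10, False, "Patient denies heroin use.", True)
    + make(75, False, "No relevant history.", False)
)

# Second classification: ICD code OR evidence in the clinical note.
for r in records:
    r["oud_combined"] = r["icd_oud"] or note_suggests_oud(r["note"])

or_icd = odds_ratio(records, "icd_oud")
or_combined = odds_ratio(records, "oud_combined")
print(f"OR, ICD only:  {or_icd:.2f}")
print(f"OR, ICD + NLP: {or_combined:.2f}")
```

In this toy data the ICD-only classification leaves five OUD cases in the unexposed group, which changes both rows of the 2x2 table. The direction and size of any such shift in real data depend on how misclassification relates to the outcome, so the numbers here carry no empirical meaning.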