Automated identification of type 2 diabetes mellitus: code versus text
University of South Carolina
Vanessa L. Congdon
University of South Carolina - Columbia
Recommended Citation
Congdon, V. L. (2014). AUTOMATED IDENTIFICATION OF TYPE 2 DIABETES MELLITUS: CODE VERSUS TEXT.
(Doctoral dissertation). Retrieved from
This Open Access Dissertation is brought to you for free and open access by Scholar Commons. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of Scholar Commons.
AUTOMATED IDENTIFICATION OF TYPE 2 DIABETES MELLITUS:
CODE VERSUS TEXT
Vanessa L. Congdon
Bachelor of Science
Longwood University, 2007
Submitted in Partial Fulfillment of the Requirements
For the Degree of Master of Science in Public Health in
The Norman J. Arnold School of Public Health
University of South Carolina
Anwar T. Merchant, Director of Thesis
Robert Moran, Reader
Linda J. Hazlett, Reader
Lacy Ford, Vice Provost and Dean of Graduate Studies
Copyright by Vanessa L. Congdon, 2014
All Rights Reserved
This work is dedicated to my family and friends. Thank you all for believing in me and
continually encouraging me to achieve my dreams.
This thesis would not have been possible without the continued support and
guidance from a number of people. First, I would like to thank my committee chair, Dr.
Anwar Merchant, for his knowledge, guidance, and flexibility to work with me from afar.
I am also indebted to my thesis committee members, Dr. Linda Hazlett and Dr. Robert
Moran, for their time, honest critiques, and willingness to guide me through the entire
process. I would also like to acknowledge my PPRNet mentors, Dr. Steven Ornstein and
Dr. Ruth Jenkins for providing me with the PPRNet data and for molding me into a true
My success as a student would not have been possible without the unwavering
support of my family and friends. A special thanks to my parents for their unconditional
love and support through my darkest of days during this long process. And lastly, thank
you to my biggest cheerleader and best friend, Jason, for your limitless support and for
making every day of this journey a lot more enjoyable.
A growing emphasis in the healthcare industry today is being placed on
demonstrating meaningful use of one's Electronic Health Record (EHR) system. As rates
of chronic disease, including diabetes mellitus (DM), rise, it has become clear that
accurate and timely disease surveillance could be greatly improved utilizing the
technologies available to clinicians today. As the Centers for Medicare and Medicaid
Services (CMS) meaningful use incentive program deadlines fast approach, it remains
unclear if their limited attestation criteria clearly reflect their end goal of improving
patient care. The objective of this research was to determine the diagnostic accuracy of
an automated text-based algorithm for identifying patients with diabetes mellitus from
the longitudinal PPRNet Database.
The longitudinal PPRNet database is comprised of McKesson's Practice
Partner, Lytec or Medisoft EHR system users nationwide. The analysis included data
from the 115 PPRNet practices that submitted their 4th quarter data extract in January
2014. An unstructured free-text algorithm was used to determine the number of type 2
diabetics among all active adult patients. This algorithm, which examines unstructured
free-text data documented within the EHR title lines, was compared to a previously
established protocol which used a combination of ICD-9 diagnostic codes and/or active
DM prescriptions.
Across all algorithm comparisons, the patients identified as having diabetes
varied considerably. Using the combination of ICD-9 diagnostic codes and/or active DM
prescriptions as the comparison method, the resulting sensitivity was 77.8% and specificity
was 97.2% for the free-text definition. Using diagnostic codes alone as the standard for
comparison resulted in a much higher sensitivity (99.3%) and a lower specificity (91.9%).
However, when we compared the free-text definition to the ICD-9 diagnostic codes
alone, 70% of free-text identified cases were found to be un-coded.
As EHR use continues to rise, it is crucial that we continue to develop
ways to accurately translate patient data out of these systems in order to meaningfully
utilize these powerful technologies. This thesis has helped clarify the need for further
development of accurate data translation platforms in order to capture each patient's full
and unique health story as well as for monitoring treatment and outcomes all while
minimizing physician burden.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF ABBREVIATIONS
CHAPTER I – Introduction
1.1 Statement of the Problem
1.2 Purpose and Objectives
1.3 Significance of Research
CHAPTER II – Literature Review
2.1 Diabetes Mellitus
2.2 U.S. Healthcare's Transition to Electronic Health Record Systems
2.3 Data Structure
CHAPTER III – Methods
3.1 Study Design
3.2 Measurement
3.3 Statistical Analysis
CHAPTER IV – Results
4.1 Sample Characteristics
4.2 Sample Characteristics of Test Identified Diabetes Mellitus Population
4.3 Algorithm Evaluation: DM Prevalence, Sensitivity and Specificity
CHAPTER V – Discussion
5.1 Strengths of Study
5.2 Limitations of Study
5.3 Future Research
5.4 Conclusions
TABLE 2.1: Description of comparative studies that examine the reliability and validity of EHR derived algorithms for clinical quality measurement
TABLE 3.1: Drugs for treatment of Type 2 Diabetes Mellitus
TABLE 4.1: Sample Characteristics of PPRNet Population and Adults with Text-Identified Type 2 Diabetes Mellitus
TABLE 4.2: 2-year DM Prevalence among All Active Adult Patients in 115 PPRNet Practice Sites by Algorithm
TABLE 4.3: Sensitivity and Specificity of Unstructured Free-Text Algorithm Using Different Standards of Comparison
LIST OF ABBREVIATIONS
CDC: Centers for Disease Control and Prevention
HITECH: Health Information Technology for Economic and Clinical Health
PBRN: Primary Care Based Research Network
PPRNet: Practice Partner Research Network
Statement of the Problem
Diabetes mellitus (DM) is one of the most prevalent, costly, and burdensome
chronic illnesses in the U.S., with nearly 10% of the entire population diagnosed with
diabetes and 35% with prediabetes. The American Diabetes Association predicts that as
many as 1 in 3 Americans will have diabetes by 2050. As Americans become
increasingly plagued by diabetes, accurate and timely disease surveillance is becoming
increasingly important for clinicians, clinical researchers, policy makers and health plan
administrators. Historically, disease surveillance required manual review of paper charts
or large national surveys, both of which are time consuming and costly; however the
nationwide shift to electronic health records (EHR) provides the potential for a more
The Health Information Technology for Economic and Clinical Health (HITECH)
Act passed by the U.S. Congress in 2009 is investing billions of dollars in incentives to
clinicians who can demonstrate meaningful use of their EHR systems over the next
several years. This act was set into motion with hopes of molding EHRs from data
graveyards into data warehouses. Ideally these warehouses will contain extractable,
secure, comprehensive, and standardized health information. Meaningful use
includes both a core set and a menu set of objectives that are specific to eligible
providers, hospitals and critical access hospitals (CAH). There are a total of 24
meaningful use objectives for eligible providers, and 23 objectives for eligible hospitals
and CAHs. To qualify for an incentive payment, 19 of these 24 or 18 of the 23 objectives
must be met. Due to the significant requirements for meaningful use attestation, the
program is divided into 3 stages for qualification. In the first stage of participation,
providers must demonstrate meaningful use for a 90-day EHR reporting period; in
subsequent stages, providers will demonstrate meaningful use for a full year EHR
reporting period. Programs are not required to demonstrate meaningful use in consecutive
years; however, there are deadlines for attesting to each stage. All hospitals and practices
that choose not to participate in the program will face reductions in Medicare
reimbursement rates.
The overarching goal of this meaningful use incentive program is to push the
U.S. health care system to exploit and expand health information technology; however,
this major overhaul presents many challenges to all parties involved. As the deadlines for
qualifying as a stage 2 meaningful use vendor quickly approach, EHR software
companies struggle to keep up, preventing proper usability assessments during
development. A certified stage 2 meaningful use EHR vendor must enable providers
to record data in a structured format, allowing for data to be more easily retrieved and
transferred, with hopes of optimizing health technology to improve patient care.
Meanwhile, practitioners continue to struggle with insufficient interfaces, and
clinical researchers suffer from a lack of standardized terminologies, yet both have little
say in future system developments. EHRs contain two types of data: structured,
coded data and unstructured, free-text data. Both types of data contain important
information about the patient's unique health story. Many providers find that entering
standardized data, rather than free text, takes more time and effort. Some feel that current
software is lacking in standardized matches for many common chronic conditions.
West et al. highlighted that the fragmentation of the U.S. healthcare system hinders chronic
disease management as well as longitudinal research on these diseased populations.
Because patients see multiple providers in their lifetime, tracking a patient's care remains
extremely difficult. Researchers advise further validation of electronic database
extraction techniques before using them to assess quality of care.
Diabetes surveillance remains a top priority of the CDC, which developed and
maintains the world's first diabetes surveillance system. These surveillance data rely on
national and state-based household, telephone, and hospital-based surveys and vital
statistics to monitor diabetes trends. In collaboration with the NIH, the CDC has also
initiated the SEARCH for Diabetes in Youth study, the largest surveillance system
to quantify and track the diabetes burden in Americans under 20 years of age. The
SEARCH study provides population-based information on the underlying factors, trends,
impact, and level of care provided, and allows researchers to clarify the degree to
which type 2 diabetes is affecting youth of different racial and ethnic backgrounds.
Overall, the CDC's surveillance data are used to understand the diabetes epidemic, identify
vulnerable at-risk populations, set prevention objectives, and monitor successes of
programs over time, all at the national level.
Purpose and Objectives
The purpose of this thesis is to optimize methods for identification of patients
with type 2 diabetes mellitus (DM) from de-identified EHRs of primary care practices in
the Practice Partner Research Network (PPRNet). PPRNet is a practice-based research
network (PBRN) that was established in 1995 as a collaborative effort between the
Department of Family Medicine at the Medical University of South Carolina (MUSC),
McKesson in Seattle, WA, and participating primary care or internal medicine practices
nationwide. The PPRNet database contains historical clinical data from 1987 through
2013 from 340 practices and more than 5 million patients. Currently PPRNet has 151
active member practices who electronically submit quarterly data extracts to PPRNet for
aggregation and analysis.
Our structured coded-data algorithm used for comparison was developed from the
previously established definition that Miller et al. used in 2004 to auto-identify DM
patients in the Department of Veterans Affairs database to calculate best estimates of DM
prevalence and incidence rates. Our unstructured text data algorithm uses a
developed data dictionary based on natural language processing to identify cases of DM
through evaluation of unstructured text data from the title lines within the EHR. This
thesis will test the diagnostic accuracy of the unstructured text algorithm in comparison
with Miller's identification protocol. The specific aims for this thesis are:
Specific Aim 1: Unstructured text data
• Identify cases of DM from de-identified EHRs of primary care practices
participating in PPRNet using developed algorithms based on natural
language processing to identify cases of DM through evaluation of
unstructured text data from the title lines within the EHR.
Specific Aim 2: Structured coded data
• Identify cases of DM from de-identified EHRs of primary care practices
participating in PPRNet using an algorithm established by Miller et al. that
assesses ICD-9 codes and diabetes medications from structured diagnostic
Specific Aim 3: Diagnostic accuracy
• Compare the unstructured text-based algorithm versus Miller's algorithm that
assesses ICD-9 codes and diabetes medication prescriptions for identifying
patients with diabetes.
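
For illustration, the comparison in Aim 3 reduces to a 2x2 cross-classification of the two algorithms' per-patient flags. The following minimal Python sketch shows how sensitivity and specificity could be computed when the code-based algorithm is treated as the reference. The function name `sensitivity_specificity` and the eight example flags are hypothetical and are not drawn from the PPRNet data or from the analysis code used in this thesis.

```python
def sensitivity_specificity(test_flags, reference_flags):
    """Return (sensitivity, specificity) of test_flags against reference_flags.

    Both arguments are equal-length sequences of booleans, one per patient.
    """
    pairs = list(zip(test_flags, reference_flags))
    tp = sum(1 for t, r in pairs if t and r)          # flagged by both
    fn = sum(1 for t, r in pairs if not t and r)      # missed by the test
    tn = sum(1 for t, r in pairs if not t and not r)  # flagged by neither
    fp = sum(1 for t, r in pairs if t and not r)      # flagged by the test only
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical flags for eight patients (illustration only).
text_flags = [True, True, False, False, True, False, False, False]
code_flags = [True, True, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(text_flags, code_flags)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Because the reference here is another algorithm rather than chart review, a "false positive" in this sketch means only that the text algorithm flagged a patient the reference did not.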
Significance of Research
Specific aims of this thesis will assess the diagnostic accuracy of a new
unstructured text-based algorithm in comparison to an established structured code-based
algorithm. Several studies have been conducted to evaluate methods for estimating
disease prevalence or identifying high-risk patients from structured EHR data, or claims
data. Much existing research focuses on the use of automated data retrieval strategies to
assess quality of care, although a study comparing the data documented within structured,
coded fields with unstructured, narrative fields has yet to be performed. As the goals of
the meaningful use EHR incentive program continue to propel the U.S. healthcare system
forward at a rapid rate, it is important to evaluate the current system operations in order to
monitor the impact these changes have on achieving desired long-term outcomes. This
thesis intends to not only present the diagnostic accuracy of this proposed diagnostic tool,
but also highlight the fundamental differences between data recorded in structured and
The prevalence of type 2 DM in the United States is increasing at a rapid rate, along
with health care costs and other associated complications. From 1980 to 2011, the
crude prevalence of diagnosed diabetes rose 176% (from 2.5% to 6.9%). The
American Diabetes Association (ADA) reported as of March 2013, 25.8 million (8.3%)
Americans have diabetes, listing 7.0 million of those as undiagnosed. The total annual
costs attributable to diabetes are estimated to be nearly 245 billion dollars, accounting for
20% of all health care expenditures in the U.S. Another 79 million Americans have
prediabetes, of which only 7.3% have been told by their physician. Prediabetes, also
commonly referred to as impaired glucose tolerance (IGT) or impaired fasting glucose
(IFG), almost always precedes the development of type 2 diabetes.
While risk factors such as genetics, ethnicity, birth weight and metabolic
syndrome certainly play a role in the development of diabetes, several controllable
lifestyle factors, such as one's weight, diet, exercise regimen and smoking status also
influence a person's probability of acquiring the disease. The ADA reported 85.2% of
people with type 2 diabetes are overweight or obese. Given the magnitude of this
problem, the U.S. healthcare system needs accurate, automated data retrieval methods to
estimate and monitor its prevalence and evaluate the quality of care.
U.S. Healthcare's Transition to Electronic Health Record Systems
Many large institutions nationwide have adopted EHR systems, while fewer small
clinics and primary care practices, which treat a majority of Americans, have integrated
health information technology (HIT) into their practices. Among these early adopters,
few properly utilized advanced features such as clinical decision support, point of care
alerts, patient activation, and overdue service reminder letter generation. While
clinical decision support has been shown to improve measures such as preventive care screening
rates among primary care doctors, an unintended inverse effect, alert fatigue, has
surfaced when alerts are used too frequently. A lack of standard data definitions and
interoperability hinders nationwide implementation of comprehensive Personal Health
Records (PHR), highlighting the urgent need for clinical informatics. These patient
portals are currently utilized by less than 1% of the U.S. population. The healthcare
system recognizes the potential these portals could have for stimulating patient
engagement. This platform would allow patients access to their personal health
information, as well as educational material and tools, empowering them to become
active participants in the management of their own health.
The U.S. Congress enacted the Health Information Technology for Economic and
Clinical Health (HITECH) Act as part of the American Recovery and Reinvestment Act
of 2009 to allow the Centers for Medicare and Medicaid Services to provide incentives to clinicians
and hospitals who demonstrate meaningful use of their EHR system. The
requirements for participation gradually increase throughout the three stages, qualifying
providers that attest to each stage for significant incentive payments, and penalizing
those that do not successfully attest to stage two requirements at least three months before
the end of the 2014 payment year.
2.2.1 Electronic Health Records and Quality Clinical Care and Measurement
As clinicians across the country strive to earn these meaningful use incentives,
greater emphasis has been placed on the validity of current EHR-derived clinical quality
measures. Although the potential rewards are enormous, the accompanying challenges
should not be underestimated. Historically, clinical researchers, health plan
administrators and policymakers have relied on administrative, claims-based databases,
and self-report to deduce clinical context, often producing misleading results that
underestimate quality-of-care measures. Self-report has been shown to
overestimate diabetes quality of care measures.
Claims databases were developed to collect insurance payments, not track clinical
information. Consequently, much relevant health information that is unnecessary for
processing payments may not be collected or recorded accurately. Pharmacy claims
often fail to identify chronic conditions like diabetes and hypertension that are being
controlled by diet alone. The comparison of claims with medical record data
produced complementary information on diabetes quality of care measures, resulting in
mixed reliability, with the highest agreement for microalbumin testing and the lowest for
eye examination. A later study compared a claims-based strategy and an EHR-based
method with a manual review reference group in the identification of pharyngitis.
Overall, a larger proportion of cases were correctly identified by the EHR-based strategy
than the administrative data-based strategy. The administrative data-based strategy did,
however, boast a higher specificity than the EHR-based method, emphasizing the need for
more rigorously defined EMR-based retrieval strategies before utilizing them for quality
of care measurement. In 2012, Ganz et al. extracted structured coded data on falls in
the elderly and compared it with manual review. They found that only 54% of falls were
identified within the coded data, and that much documentation regarding the care
surrounding each event was recorded in non-structured form. In conclusion, because the
accuracy of quality of care measures varies greatly between the types of care process being
evaluated and presents unique challenges, future validation studies comparing automated
algorithms to manual review will be beneficial.
2.2.2 Chronic Disease Identification within the Electronic Health Record
Accurate chronic disease identification within the EHR is essential to surveillance
efforts, the development of patient care plans, and clinical research advancements.
Clinician documentation style remains the essential focus for improvement. Chronic
disease management often requires the coordination of many physicians. Due to
incongruent EHR systems, much treatment documentation from specialists fails to be
entered into the EHR utilized by the patient's primary care providers. Most information
that is relayed winds up in the free text portion of office notes, which automated searches
do not detect. Shifting to a more team-based care approach is necessary for
improved identification and care of chronic illness.
Strict algorithms for identification also prove to be important. In 2004, a study to
estimate DM rates over a three-year period within the Department of Veterans Affairs
DEpic electronic database was conducted. This study compared varying combinations of
EHR-derived DM criteria to self-reported DM cases. The algorithm with the highest
sensitivity (93%) and specificity (98%) used DM medication prescription records in the
current year and/or 2 diabetes codes from inpatient and/or outpatient visits (VA and
Medicare) over a 24-month period. When similar algorithms were applied to claims
databases in 2006, Solberg et al. reported final positive predictive values (PPV) between
0.965 and 1.0. All algorithms were tested on a small sample population and then
adapted, producing a final algorithm with the following inclusion criteria: 2 or more
outpatient or 1 inpatient ICD-9 codes for diabetes within one year, or a filled prescription
for diabetes-specific medication in the same calendar year. After initial chart review,
metformin was found to be used to treat other conditions, such as polycystic ovary
syndrome, infertility and reactive hyperglycemia, and was removed as a diabetes-specific
medication from the final algorithm.
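
A code-and/or-prescription criterion of the kind described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published implementation of either study: the function names are hypothetical, the drug set `DM_SPECIFIC_DRUGS` contains two illustrative names only, the 24-month window is approximated as 730 days, and "current year" is interpreted as the calendar year of the evaluation date.

```python
from datetime import date, timedelta

# Illustrative drug names only; not the drug list used by either study.
DM_SPECIFIC_DRUGS = {"glipizide", "insulin glargine"}


def is_diabetes_code(icd9_code):
    """True for any code in ICD-9 category 250 (diabetes mellitus)."""
    return icd9_code.strip().split(".")[0] == "250"


def meets_code_or_rx_criterion(diagnoses, prescriptions, as_of,
                               min_codes=2, code_window_days=730):
    """Flag a patient with >= min_codes diabetes codes in the look-back
    window and/or a diabetes-specific prescription in the calendar year
    of `as_of` (assumed interpretation of "current year").

    diagnoses:     list of (icd9_code, visit_date) tuples
    prescriptions: list of (drug_name, prescription_date) tuples
    """
    window_start = as_of - timedelta(days=code_window_days)
    n_codes = sum(
        1 for code, visit_date in diagnoses
        if is_diabetes_code(code) and window_start <= visit_date <= as_of
    )
    has_rx = any(
        drug.lower() in DM_SPECIFIC_DRUGS and rx_date.year == as_of.year
        for drug, rx_date in prescriptions
    )
    return n_codes >= min_codes or has_rx


# Hypothetical patients evaluated at the end of 2013.
as_of = date(2013, 12, 31)
two_codes = [("250.00", date(2012, 6, 1)), ("250.02", date(2013, 3, 15))]
one_code = [("250.00", date(2013, 3, 15))]
print(meets_code_or_rx_criterion(two_codes, [], as_of))
print(meets_code_or_rx_criterion(one_code, [], as_of))
```

Leaving metformin out of the illustrative drug set mirrors the adjustment reported by Solberg et al.; a patient whose only evidence is a metformin prescription would not be flagged by this sketch.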
The data contained in an EHR can be classified into one of two types:
structured, coded data, or unstructured, free-text data. Much recent research has focused
on comparing the type of data stored in each form and its relation to clinical quality
measurement. The meaningful use incentive program has identified many of the
limitations in using unstructured data for these purposes, thus encouraging clinicians to
document in structured, coded formats in order to attest in both stage 2 and stage 3.
Many structured fields successfully capture all relevant information needed for some
quality measures, such as blood pressure recorded in vital signs for hypertension
measures. However, much of the literature suggests that the completeness of the
medical records and ease of extractability vary greatly depending on the clinical area of
focus. The literature referenced in the following sections presents the positive and
negative attributes of both data types.
2.3.1 Unstructured Data
Unstructured, narrative text provides unique insight into the quality of care
because it represents a provider's thought process, unrestricted by structured
vocabularies. This extensive narrative data is made valuable through the use of natural
language processing (NLP). Most challenges in NLP arise in the process of deriving
meaning from human or natural language input. Although NLP continues to improve,
recall and precision rates vary significantly between systems. Narrowly and consistently
defined variables, such as gender, race and test results tend to demonstrate the highest
rates of both, while variables with multiple definitions remain difficult to capture and
Studies that have only evaluated structured data fields have regularly stated that
the algorithms missed recognition because relevant information, such as exclusion
criteria, was only documented in narrative form. Another study found that their NLP
system consistently out-performed the use of ICD-9 billing codes in identifying the
condition of interest. Overall, the condition of interest being evaluated has the
largest impact on NLP results.
Existing literature highlights the limitations associated with manual review, the
use of administrative data, EHR data structure and format, and extraction procedures.
One major issue with auto-extracted data stems from under-recording in
reasonably accessible fields such as medication lists. This type of automated
recognition software has been applied to discharge summaries, radiology reports, and
other qualitative data from limited sections of the patient's EHR, resulting in validity
ranging from low to high. When NLP was used in combination with ICD-9 codes, Zeng et
al. found that accuracy improved. NLP systems have been shown to accurately identify
risk factors and diagnostic criteria associated with certain medical conditions. Byrd et al.
successfully developed NLP algorithms using Framingham criteria for early detection of
heart failure patients.
2.3.2 Structured Data
Structured, coded data allows for interoperability between systems. This type of
data also makes accurate secondary use easier. Readily available and directly
analyzable EHR data reduces the need for extensive manual chart review, thus allowing
for performance measures to be more easily assessed on a larger proportion of patients in
care. When structured data was compared with full chart review results from the
Veterans Health Administration's External Peer Review Program (EPRP) on several
measures, over 80% of the data on these selected measures was found in a directly
analyzable format within the EHR. While the EPRP data were found to be more
complete, the correlation of measures between sources was very high (0.89-0.98).
Much focus has been placed on standardizing EHR output, while very little emphasis,
until recently, has been aimed at standardizing EHR data inputs. All clinicians are
initially trained on proper documentation techniques in their EHR training. These
techniques are often reinforced by quality improvement specialists; however no
mechanism within the EHR forces providers to document in a particular location in the
chart. Intensive training, automatic prompts and proper feedback are necessary in
standardizing their documentation habits to reflect the care given in EHR-derived quality
Even standardized data comes with drawbacks. Botsis et al. found much
inaccuracy within coded data. Often a non-specific ICD-9 code is selected, such as
250 for diabetes, when a more accurate diagnosis is actually made at the point of care.
Inconsistencies within the data also prove to be troublesome, with records sometimes displaying both
250.01 and 250.02, for type 1 and type 2 diabetes respectively. They also highlight the
lack of contextual information the current ICD-9 coding system supports.
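
The two coding problems described above, non-specific codes and conflicting type codes, can be screened for mechanically. The sketch below is a hypothetical illustration (the function names are not from Botsis et al. or from this thesis). It relies on the ICD-9-CM convention that the fifth digit of a 250.xx code indicates diabetes type: 1 and 3 denote type 1, while 0 and 2 denote type 2 or unspecified type.

```python
import re

# Matches "250", "250.x", or "250.xy"; group 2 is the fifth digit if present.
_DM_CODE = re.compile(r"^250(?:\.(\d)(\d)?)?$")


def diabetes_type_from_icd9(code):
    """Classify an ICD-9 code as 'type 1', 'type 2', 'unspecified', or None.

    None means the code is not in category 250. Fifth digits 0 and 2 are
    labelled 'type 2' here, although ICD-9-CM defines them as
    'type II or unspecified type'.
    """
    match = _DM_CODE.match(code.strip())
    if match is None:
        return None
    fifth_digit = match.group(2)
    if fifth_digit in ("1", "3"):
        return "type 1"
    if fifth_digit in ("0", "2"):
        return "type 2"
    return "unspecified"  # e.g. bare "250" or a four-digit "250.0"


def has_conflicting_types(codes):
    """True if a patient's code list contains both type 1 and type 2 codes."""
    types = {diabetes_type_from_icd9(code) for code in codes}
    return "type 1" in types and "type 2" in types


# Hypothetical code lists for two patients.
print(has_conflicting_types(["250.01", "250.02"]))
print(has_conflicting_types(["250.00", "250.02", "401.9"]))
```

A screen like this identifies records that need review; it cannot determine which of the conflicting codes reflects the diagnosis made at the point of care.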
Table 2.1: Description of Comparative Studies that examine the Reliability and Validity of EHR derived Algorithms for Clinical Quality Measurement

Baker et al.
Population/design: [...] failure patient with 2 or more clinic visits within the 18 [...]
Findings: Automated review of the EHR was comparable to manual review for left ventricular ejection fraction (LVEF) measurement (94.6% vs. 97.3%), prescription of beta blockers (90.9% vs. 92.8%), and prescription of ACE inhibitors or ARBs (93.9% vs. 98.7%). Performance was lower for prescription of warfarin for atrial fibrillation (70.4% vs. 93.6%).

Baldwin et al.
Population/design: N = 60; Women ≥ 40 years structured [...] Health Center in 2001
Findings: A significant difference between Natural Language Processing (NLP) methods and manual review was found. The NLP method found a false positive rate of 0, and a false negative rate of .035.

Benin et al.
Population/design: N = 479; possible [...] analyzed using (1) EMR-based, (2) administrative data-based, and (3) manual review reference strategies
Findings: When comparing each group to the reference, 91% of EMR-based strategy episodes were confirmed and 59% of the administrative data-based strategy.

Fowles et al.
Population/design: Cross-sectional [...] with Diabetes, aged [...]
Findings: Reliability between primary medical record and claims varied by measure: eye examination (K = 0.371), oral agents (K = 0.699), insulin (K = 0.548), HbA1c (K = 0.678) and microalbumin (K = 0.748).

Ganz et al.
Population/design: N = 215; Falls data [...] initiative in primary care medical groups
Findings: A structured visit note was found in 54% of charts within 3 months of the date patients had been identified as falling. The reliability of the codable-data algorithm was good (K = 0.61) compared with full medical record review for three care processes.

Goulet et al.
Population/design: VA patients with [...]
Findings: Over 80% of the selected measures were found in directly analyzable form within the EMR. The degree of correlation between automated algorithms assessing structured fields in comparison to the Veterans Health Administration's External Peer Review Program (EPRP) was high (0.89-0.98).

Hivert et al.
Population/design: [...] adult patients from [...] practices in eastern [...]
Findings: Directly measured EHR-defined MetS had 73% sensitivity and 91% specificity. DM incidence was 1.4% in the No MetS group vs. 4% in the At-Risk-for-MetS [...]

Miller et al.
Population/design: [...] patients recorded in [...]
Findings: The most accurate criterion was a prescription for diabetes medication in the current year and/or 2+ diabetes codes from inpatient and/or outpatient visits (VA and Medicare) over a 24-month period (Se = 93% and Sp = 98%) against patient self-report.

Owen et al.
Population/design: [...] sample of inpatient and outpatient visits [...] patients from the [...] Administration database (VistA)
Findings: The percent agreement between automated algorithms and manual review among patients with chlorpromazine equivalents < 300, 300-1,000, and > 1,000, are .11, .41, and .21, respectively for inpatients, and .19, .21 and .40 for outpatients. The overall weighted Kappa for inpatients (K = 0.55) and outpatients (K = 0.63).

Parsons et al.
Population/design: [...] EHR records from [...]
Findings: The majority of diagnoses for chronic conditions had information documented in the problem list (a structured field) and were recognized by the automated quality measures, including diabetes (>91.4% across measures), hypertension (89.3%), ischemic cardiovascular disease (>78.8% across measures) and dyslipidemia (75.1%).

Persell et al.
Population/design: N = 1,006; All CAD [...]
Findings: Performance on 7 quality measures varied from 81.6% for lipid measurement to 97.6% for blood pressure measurement. After including free-text data, the adherence rate increased, ranging from 87.5% for lipid measurement and low-density lipoprotein cholesterol to 99.2% for blood pressure measurement.
We used a cross-sectional diagnostic accuracy study design, analyzing data
from the longitudinal PPRNet database. PPRNet was established in 1995 as a
collaborative effort between the Department of Family Medicine at the Medical
University of South Carolina (MUSC), Practice Partner/McKesson in Seattle, WA and
participating primary care and internal medicine practices. PPRNet is a practice-based
research network (PBRN) that strives to improve the quality of healthcare in its member
practices by turning clinical data into actionable information, empirically testing
theoretically sound quality improvement interventions, and disseminating successful
interventions to primary care providers across the country. Currently PPRNet has 151
physician practices, representing over 1068 health care providers, and approximately 1.4
million patients located in 38 states. All of PPRNet's member practices currently use
McKesson's Practice Partner, Lytec or Medisoft EHR systems. These data are
extracted and sent to PPRNet on a quarterly basis. Data are then cleaned, appended to the
longitudinal database and analyzed to produce quality improvement reports on 65 clinical
quality measures (CQM). These quality measures include ten diabetes mellitus measures
and track the quality of care on several other common conditions such as cardiovascular
disease and respiratory disease, with additional focus on women's health, cancer screening,
immunizations, mental health, substance abuse, and medication safety.
3.1.2 Study population
The eligible patient population comprised active patients from 115
PPRNet practices that sent their fourth quarter data extract in January 2014. A patient
was defined as active if he/she had a visit within 1 year and was not designated with a
deceased or inactive status. A visit was determined by a progress note title that did not
include text indicating a cancelled appointment or no show. Similarly, in either
approach, the recorded data must not be designated with an inactive status or a resolved
3.1.3 Inclusion and exclusion criteria
The electronic health records of all active patients ≥ 18 years of age were evaluated
for an active diagnosis of type 2 diabetes mellitus made within the last 2 years.
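The eligibility rules described in this section can be illustrated with a short Python sketch. It is not PPRNet's extraction code: the `Patient` fields, the cancellation keywords in `NON_VISIT_TERMS`, and the choice of a trailing 365-day window for "within 1 year" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical keywords that mark a progress note title as a non-visit.
NON_VISIT_TERMS = ("cancel", "no show", "no-show")


@dataclass
class Patient:
    birth_date: date
    status: str  # hypothetical values: "active", "inactive", "deceased"
    notes: list = field(default_factory=list)  # (note date, progress note title) pairs


def is_visit(title: str) -> bool:
    """A note counts as a visit unless its title signals a cancellation or no-show."""
    lowered = title.lower()
    return not any(term in lowered for term in NON_VISIT_TERMS)


def age_on(birth_date: date, as_of: date) -> int:
    """Age in completed years on the reference date."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_eligible(patient: Patient, as_of: date) -> bool:
    """Active adult: not inactive or deceased, aged 18+, with a visit in the past year."""
    if patient.status.lower() in ("inactive", "deceased"):
        return False
    if age_on(patient.birth_date, as_of) < 18:
        return False
    cutoff = as_of - timedelta(days=365)  # assumed meaning of "within 1 year"
    return any(cutoff <= d <= as_of and is_visit(t) for d, t in patient.notes)
```

Matching on the lowercase stem "cancel" catches both "cancelled" and "canceled"; a production rule set would need the note-title conventions of each practice.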
The aims of this study were to assess DM diagnosis in a database of electronic
medical records using 3 methods: NLP, Miller's protocol, and ICD-9 codes. NLP is a
newer method that uses an algorithm based on unstructured text data, while the other two
methods have been used in the past.
3.2.1 Unstructured text evaluation
The unstructured text algorithm utilizes NLP techniques for automated
identification of diagnoses. We first developed common text variations of DM, including
full diagnosis names, ICD-9 codes, abbreviations, synonyms, and common misspellings.
These 341 text string variations were then compared to the free text data, flagging
possible diagnoses of type 2 DM and suggesting a corresponding ICD-9 code. All
flagged diagnoses with a frequency of 4 or more were then manually reviewed by a
research assistant for correctness. Text strings were then either classified as definite
diagnoses of type 2 DM, or excluded from future analysis. These text string
classifications were then reviewed by a clinician for accuracy. This review process is
conducted on a quarterly basis. Each quarter, only new text variations with a frequency
greater than 3 are flagged for manual review. Currently, the PPRNet database contains
13,231 text variants included as DM.
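The flagging and frequency-threshold steps above can be sketched as follows. This is a simplified illustration rather than the PPRNet implementation: the handful of entries in `DM_VARIANTS` are hypothetical stand-ins for the 341 text string variations, and plain substring matching is an assumption, since the matching technique is not specified here.

```python
from collections import Counter

# Hypothetical fragment of a DM text-string dictionary (the study used 341 variations).
DM_VARIANTS = ("type 2 diabetes", "type ii diabetes", "dm 2", "dm2", "niddm", "diabetis", "250.00")

MIN_FREQUENCY = 4  # strings seen fewer than 4 times are not sent for manual review


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(title.lower().split())


def flag_candidates(titles):
    """Count each distinct free-text string that contains any dictionary variant."""
    counts = Counter()
    for raw in titles:
        text = normalize(raw)
        if any(variant in text for variant in DM_VARIANTS):
            counts[text] += 1
    return counts


def review_queue(counts, already_reviewed=frozenset()):
    """Return not-yet-reviewed strings that meet the threshold, most frequent first."""
    return [
        text
        for text, n in counts.most_common()
        if n >= MIN_FREQUENCY and text not in already_reviewed
    ]
```

Passing the previously classified strings as `already_reviewed` mirrors the quarterly cycle, in which only new variations reach the research assistant. Substring matching will over-flag (for example, a negated diagnosis), which is why the manual review step matters.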
3.2.2 Structured data evaluation
The coded, structured data evaluation algorithm we used is based on Miller's
definition for DM identification in a VA population [Miller 2004]. This definition
included a prescription for a diabetes medication in the current year and/or 2 or more
recorded type 2 diabetes ICD-9 diagnostic codes within a 24-month period. As of
January 2014, the PPRNet database contained data through December 31, 2013 from
115 practices. The DM codes included for analysis comprised the following
ICD-9 codes: 250 (excluding type 1 codes), 357.2, 362.01, 362.02, and 366.41. These were
extracted from the 4 code fields within the EHR. The medications included for DM
treatment were taken from the most current Treatment Guidelines from The Medical
Letter. The DM medications included in the analysis are listed in Table 3.1.
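A minimal sketch of the structured-data rule is given below. Several details are assumptions, not part of the published definition as described here: "current year" is treated as the trailing 365 days, each recorded code counts once regardless of date, 250.xx codes lacking a fifth digit are treated as type 2 or unspecified, and the function and parameter names are hypothetical. The fifth-digit convention itself (1 and 3 denote type 1) is standard ICD-9-CM.

```python
from datetime import date, timedelta

# Non-250 codes listed in the text.
TYPE2_RELATED_CODES = {"357.2", "362.01", "362.02", "366.41"}


def is_type2_dm_code(code: str) -> bool:
    """True for the listed ICD-9 codes; 250.x1 and 250.x3 are excluded as type 1."""
    code = code.strip()
    if code in TYPE2_RELATED_CODES:
        return True
    if code == "250" or code.startswith("250."):
        has_fifth_digit = len(code) == 6
        return not (has_fifth_digit and code[-1] in "13")
    return False


def meets_miller_definition(code_dates, rx_dates, as_of):
    """code_dates: (date, ICD-9 code) pairs; rx_dates: dates of DM medication prescriptions."""
    two_years_ago = as_of - timedelta(days=730)
    one_year_ago = as_of - timedelta(days=365)  # assumed reading of "current year"
    n_codes = sum(
        1 for d, c in code_dates if two_years_ago <= d <= as_of and is_type2_dm_code(c)
    )
    has_rx = any(one_year_ago <= d <= as_of for d in rx_dates)
    return has_rx or n_codes >= 2
```

Dropping the `has_rx` term gives the codes-only comparison algorithm (2 or more codes within 2 years).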
Table 3.1: Drugs for Treatment of Type 2 Diabetes Mellitus
500,850,1000 mg tabs
500,850,1000 mg tabs
extended- release – generic
500, 750 mg tabs
500, 750 mg tabs
500, 1000 mg tabs
500, 1000 mg tabs
500 mg/ 5 mL (4, 16 oz)
Second- Generation Sulfonylureas
Glimepiride – generic
Glipizide – generic
extended- release – generic
2.5, 5, 10 mg tabs
Glyburide – generic
1.25, 1.5, 2.5, 3, 5, 6 mg tabs
1.25, 2.5, 5 mg tabs
1.25, 2.5, 5 mg tabs
micronized tablets – generic
1.5, 3, 4.5, 6 mg tabs
1.5, 3, 6 mg tabs
Nateglinide – generic
Repaglinide -- Prandin
0.5, 1, 2 mg tabs
Pioglitazone – Actos
15, 30, 45 mg tabs
Rosiglitazone -- Avandia
Acarbose – generic
25, 50, 100 mg tabs
25, 50, 100 mg tabs
25, 50, 100 mg tabs
Sitagliptin -- Januvia
25, 50, 100 mg tabs
Saxagliptin -- Onglyza
Linagliptin -- Tradjenta
Exenatide – Byetta
250 mcg/mL (1.2, 2.4 mL
Liraglutide – Victoza
6 mg/mL (3 mL prefilled pen)
Colesevelam – Welchol
Bromocriptine – Cycloset
Pramlintide -- Symlin
1000 mcg/mL (1.5, 2.7 mL
Metformin/glipizide – generic
1000 mcg/mL (1.5, 2.7 mL
500/15, 850/15 mg tabs
500/15, 850/15 mg tabs
Actoplus Met XR
1000/15, 1000/30 mg tabs
Metformin/repaglinide – Prandimet
500/1, 500/2 mg tabs
Metformin/rosiglitazone – Avandamet
500/2, 500/4, 1000/2, 1000/4
Glimepiride/rosiglitazone – Avandaryl
1/4, 2/4, 4/4, 2/8, 4/8 mg tabs
Glimepiride/pioglitazone – Duetact
2/30, 4/30 mg tabs
Metformin/sitagliptin -- Janumet
500/50, 1000/50 mg tabs
Metformin/saxagliptin -- Kombiglyze
500/5, 1000/2.5, 1000/5 mg
Statistical analysis was performed using SAS software version 9.2 (SAS Institute,
Cary, NC). The number of type 2 DM cases was calculated using both algorithms
(described above), as well as an algorithm that evaluated ICD-9 diagnostic codes alone.
The accuracy of the unstructured text algorithm was compared to Miller's approach as
well as the ICD-9 diagnostic code algorithm by calculating sensitivity and specificity.
The unstructured text algorithm was used to calculate the 2-year prevalence of DM in
PPRNet. Rates are presented overall and in population subsets defined by patient
characteristics (age, sex, and body mass index [BMI]) and by practice type
(internal medicine, family practice, a mix of both, multi-specialty, or "other").
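Sensitivity is the share of standard-positive patients that the text algorithm also flags, TP / (TP + FN), and specificity is the share of standard-negative patients that it leaves unflagged, TN / (TN + FP). The analysis itself was run in SAS; the Python sketch below only illustrates the calculation, and the function name and input layout (two parallel lists of booleans, one entry per patient) are assumptions.

```python
def sensitivity_specificity(test_flags, standard_flags):
    """Compare paired boolean classifications; standard_flags is the comparison standard."""
    tp = fp = fn = tn = 0
    for test, standard in zip(test_flags, standard_flags):
        if standard and test:
            tp += 1
        elif standard:
            fn += 1  # standard says DM, text algorithm missed it
        elif test:
            fp += 1  # text algorithm says DM, standard does not
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Because neither Miller's protocol nor the ICD-9 codes is a true gold standard, a "false positive" here may be a real case that the comparison standard missed.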
There were a total of 368,384 active adult patients among the 115 practices that
sent their 4th quarter data extracts to PPRNet in January 2014 (Table 4.1). More than half
of the population was female (57.5%). Within the sample, 36.6% were aged 18-44 years
old, 18.6% were 45-54 years old, 19.5% were 55-64 years old, 13.9% were 65-74 years
old, 7.6% were 75-84 years old, and 3.2% were 85-108 years old. Nearly a quarter of the
population was underweight/normal weight (24.7%), while 29.8% were overweight and
38.9% were obese. A majority of PPRNet practices are family practices, accounting for
70.5% of the patient sample. Most of the remaining patients belonged to internal
medicine practices (17.1%). A small share of patients belonged to mixed practices made
up of both family practitioners and internists. Rounding out the sample were multispecialty
practices (2.6%) and "other" practices (4.5%), which consisted of Rheumatology, Pulmonary,
Gynecology, Neurology, Urology, and Pediatric practices.
Sample Characteristics of Text-identified Diabetes Mellitus Population
Just over half of adult diabetics were female (51.1%). The percentage of diabetics
increased with age before leveling off at age 74 and declining thereafter. As expected,
most of these type 2 diabetics fell in the overweight (23.7%) or obese (63.0%) BMI
categories. Fewer than 10% of PPRNet's diabetic patients were underweight (0.8%) or
normal weight (8.6%). The DM patient sample was representative of the full population
with regard to practice type, as displayed in Table 4.1.
Algorithm Evaluation: DM Prevalence, Sensitivity and Specificity
Table 4.2 presents 2-year DM prevalence estimates based on each of the three
algorithms (detailed description provided above in Section 3.2). Both the unstructured
free-text algorithm and Miller's algorithm produced the same prevalence (11.1%), while
the ICD-9 diagnostic code algorithm identified far fewer cases of DM, resulting in a
prevalence of 3.4%.
Across all algorithm comparisons, the patients identified as having diabetes
varied considerably. When we compared the unstructured free-text algorithm to Miller's,
each protocol found close to 10,000 patients that were missed by the other definition.
Using Miller's protocol as the standard of comparison, the resulting sensitivity was
77.8% and specificity was 97.2%. However, when we compared the free-text definition
to the ICD-9 diagnostic codes alone, 70% of free-text identified cases were found to be
un-coded. Only 86 additional patients had 2 or more recorded ICD-9 diagnostic codes but
were not identified using the free-text algorithm. All 86 cases identified by the code
definition alone were missed because of the low frequency of the corresponding text string. As
described in detail in the methodology, only those unstructured text diagnoses that occur
4 or more times within the data are included for review to be counted as a definite
diagnosis of DM. Using diagnostic codes alone as the standard for comparison resulted
in a much higher sensitivity (99.3%) and a lower specificity (91.9%).
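As a rough consistency check, the reported percentages can be turned back into approximate 2×2 cell counts. The sketch below uses only figures quoted in this section (368,384 patients; standard prevalences of 11.1% and 3.4%; the two sensitivity and specificity pairs). Because those inputs are rounded, the resulting counts are approximations and not the study's actual cell counts; the function name is hypothetical.

```python
N = 368_384  # active adult patients reported in the text


def implied_cells(standard_prevalence, sensitivity, specificity, n=N):
    """Back-calculate approximate 2x2 cells from rounded summary statistics."""
    std_pos = standard_prevalence * n   # patients positive by the comparison standard
    std_neg = n - std_pos
    tp = sensitivity * std_pos          # flagged by both
    fn = std_pos - tp                   # standard only (missed by free text)
    tn = specificity * std_neg
    fp = std_neg - tn                   # free text only
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn}


miller = implied_cells(0.111, 0.778, 0.972)
icd9 = implied_cells(0.034, 0.993, 0.919)

for name, cells in (("Miller", miller), ("ICD-9 codes", icd9)):
    text_pos = cells["tp"] + cells["fp"]
    print(
        f"{name}: standard-only ~{cells['fn']:.0f}, text-only ~{cells['fp']:.0f}, "
        f"text prevalence {text_pos / N:.1%}, text-only share {cells['fp'] / text_pos:.0%}"
    )
```

Under these assumptions the implied counts land close to the figures reported above: on the order of 9,000 patients missed in each direction against Miller's protocol, fewer than 100 code-only patients against the ICD-9 standard, roughly 70% of text-identified cases un-coded, and a text-based prevalence of about 11%.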
Table 4.1: Sample Characteristics of PPRNet Population and Adults with Text-Identified Type 2 Diabetes Mellitus
All Adult patients (≥18)
Overall Number and DM Prevalence
Underweight (< 18.5)
Family Practice/Internal Medicine
Table 4.2: 2-year DM Prevalence among All Active Adult Patients in 115 PPRNet Practice Sites by Algorithm
Active medication prescription and/or 2+ ICD-9 codes recorded within the previous 2 years
Active text diagnoses recorded in unstructured title lines within previous 2 years
ICD-9 diagnostic codes: 2+ ICD-9 diagnostic codes recorded within previous 2 years
Table 4.3: Sensitivity and Specificity of Unstructured Free-Text Algorithm Using Different Standards of Comparison
Standard of Comparison
ICD-9 diagnostic codes
The first aim of this study was to replicate, in PPRNet, the best definition for
automated DM identification within EHR data from Miller's 2004 study, which compared
various definitions for DM identification using the Department of Veterans Affairs
electronic health record database. We found that while this method identified the same
overall percentage of diabetic patients as the free-text method,
several thousand patients had clear evidence of a free-text diagnosis
but were missing a corresponding diagnostic code and were not on an active
prescription for a DM medication. Similarly, close to the same number of
diabetic patients were identified by Miller's definition alone and missed by the free-text
algorithm. Miller's best definition includes an active prescription for a DM medication recorded
within the last year, or 2 or more ICD-9 diagnostic codes recorded within the last 2 years.
One of the main limitations of this definition is that some commonly used DM medications
are also prescribed for other conditions. Metformin, for example, is the first-line drug of
choice for type 2 diabetics who are overweight or obese and have normal kidney function, but it is also
used in the treatment of polycystic ovary syndrome and other diseases where insulin
resistance may be an important factor.
Secondly, this paper aimed to test the accuracy of a newly developed unstructured free-text based
algorithm in identifying DM cases within an active PPRNet patient
population. One overarching limitation was our inability to access and manually
review each individual patient record, which left us with no true gold standard for
comparison. We chose Miller's definition because it had been found to be quite accurate
when compared to a patient survey. Using this standard of comparison, the free-text
definition resulted in a fair sensitivity (77.8%) and very good specificity (97.2%). Although we did not
manually review each patient record, each unique text string with a frequency of 4 or
more that was flagged by our automated DM text string dictionary
(341 unique text strings) was reviewed by a trained
research assistant. Text diagnoses that were unclear were then also reviewed by a
physician. While we cannot say with certainty that every case of DM identified using the
text algorithm is an actual case of DM, we are confident that the rate of
misclassification is very low because of this extensive review. After comparing our
algorithm with ICD-9 diagnostic codes alone, it also appears that we are missing very
few coded cases of DM, resulting in a very high sensitivity (99.3%) and a high specificity
(91.9%). Several more cases were identified when prescriptions for DM medications were added to the
definition, but as previously stated, we cannot be sure that the medication is being
used to treat DM.
Strengths of the Study
A major strength of this study is the large sample size. This sample represents the
differing documentation styles of hundreds of physicians nationwide treating hundreds of
thousands of patients in both urban and rural practice settings.
Limitations of the Study
PPRNet has very little variation in practice type and practice size, consisting
mostly of small to mid-size family practices and internal medicine clinics. Another
limitation is that all PPRNet practices use one common EHR software product in
an ever-growing marketplace of products with varying configurations. Lastly, we did not
compare our free-text based algorithm with a gold standard (physician diagnosis),
which prevented estimation of its true sensitivity and specificity. However, the development of
the NLP algorithm is an iterative process. After a query is used to identify diabetes cases,
a physician reviews the cases that the query identifies for accuracy. The query is then
modified and the process is repeated. This happens on an ongoing basis. This rather
efficient NLP algorithm was used to identify cases in this study.
We recommend that similar studies in the future use databases that contain data
from several EHR software systems to reduce bias. It would be interesting to replicate
this study in a more diverse research network, stratifying by practice site characteristics
such as size, location, and specialty as well as provider characteristics such as degree and
specialty. By looking at both practice and provider characteristics, we could get a better
understanding of the major factors that influence physician EHR documentation styles. It
would also be useful to obtain patient records for manual chart review to use as a gold
standard for comparison when testing new algorithms that could potentially aid in a
variety of arenas, such as population health. In a similarly large research network, one
could review a random sample of a small percentage of the total population rather
than manually review the charts of the entire population.
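Drawing such a review sample is straightforward. The sketch below shows a simple random sample without replacement; the function name, the sampling fraction, and the fixed seed are illustrative assumptions, and a real study might instead stratify the sample by algorithm result or practice type.

```python
import random


def chart_review_sample(patient_ids, fraction, seed=2014):
    """Draw a reproducible simple random sample (without replacement) of patient IDs."""
    ids = sorted(patient_ids)  # fixed order so the same seed reproduces the same sample
    if not ids:
        return []
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    return random.Random(seed).sample(ids, k)
```

For example, `chart_review_sample(all_ids, 0.01)` would select about 1% of patients for manual chart review, where `all_ids` is a hypothetical collection of patient identifiers.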
Our unstructured free-text evaluation performed quite well in accurately
identifying Type 2 DM patients within the PPRNet active patient population. As EHR
use is on the rise, it is crucial that we continue to develop ways to accurately translate
patient data out of these systems in order to meaningfully utilize these powerful
technologies. This paper has helped clarify the need for further development of accurate
data translation platforms in order to capture each patient's full and unique health story as
well as to monitor treatment and outcomes, all while minimizing physician burden.
1. FAST FACTS Data and Statistics about Diabetes. In. 3/1/2013 ed: American Diabetes Association; 2013. p. 2.
2. Holmes C. The problem list beyond meaningful use. Part I: The problems with problem lists. J AHIMA 2011;82(2):30-3; quiz 34.
3. Prokosch HU, Ganslandt T. Perspectives for medical informatics. Reusing the electronic medical record for clinical research. Methods Inf Med 2009;48(1):38-44.
4. EHR Incentive Programs: Meaningful Use. In: Centers for Medicare and Medicaid Services; 2013.
5. Lobach DF, Detmer DE. Research challenges for electronic health records. Am J Prev Med 2007;32(5 Suppl):S104-11.
6. Richesson RL, Krischer J. Data standards in clinical research: gaps, overlaps, challenges and future directions. J Am Med Inform Assoc 2007;14(6):687-96.
7. West S, Blake C, Zhiwen L, McKoy J, Oertel M, Carey T. Reflections on the use of electronic health record data for clinical research. Health Informatics Journal 2009;15(2):108-21.
8. Benin AL, Vitkauskas G, Thornquist E, Shapiro ED, Concato J, Aslan M, et al. Validity of using an electronic medical record for assessing quality of care in an outpatient setting. Med Care 2005;43(7):691-8.
9. Miller DR, Safford MM, Pogach LM. Who has diabetes? Best estimates of diabetes prevalence in the Department of Veterans Affairs based on computerized patient data. Diabetes Care 2004;27 Suppl 2:B10-21.
10. Diabetes Data & Trends: Crude and Age-adjusted Percentage of Civilian Non-institutionalized Adults with Diagnosed Diabetes, United States, 1980-2011. In. Atlanta, GA: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention; 2011.
11. Goetz Goldberg D, Kuzel AJ, Feng LB, DeShazo JP, Love LE. EHRs in primary care practices: benefits, challenges, and successful strategies. Am J Manag Care 2012;18(2):e48-54.
12. Greiver M, Barnsley J, Glazier RH, Moineddin R, Harvey BJ. Implementation of electronic medical records: effect on the provision of preventive services in a pay-for-performance environment. Can Fam Physician 2011;57(10):e381-9.
13. O'Connor PJ, Crain AL, Rush WA, Sperl-Hillen JM, Gutenkauf JJ, Duncan JE. Impact of an electronic medical record on diabetes quality of care. Ann Fam Med 2005;3(4):300-6.
14. Harrison MI, Koppel R, Bar-Lev S. Unintended consequences of information technologies in health care--an interactive sociotechnical analysis. J Am Med Inform Assoc 2007;14(5):542-9.
15. DeJesus RS, Angstman KB, Kesman R, Stroebel RJ, Bernard ME, Scheitel SM, et al. Use of a clinical decision support system to increase osteoporosis screening. J Eval Clin Pract 2010;18(1):89-92.
16. Katzan IL, Rudick RA. Time to integrate clinical and research informatics. Sci Transl Med 2012;4(162):162fs41.
17. Tang PC, Lansky D. The missing link: bridging the patient-provider health information gap. Health Aff (Millwood) 2005;24(5):1290-5.
18. Nagykaldi Z, Aspy CB, Chou A, Mold JW. Impact of a Wellness Portal on the delivery of patient-centered preventive care. J Am Board Fam Med 2012;25(2):158-67.
19. Blumenthal D, Tavenner M. The "meaningful use" regulation for electronic health records. New England Journal of Medicine 2010;363(6):501-4.
20. Pawlson LG, Scholle SH, Powers A. Comparison of administrative-only versus administrative plus chart review data for reporting HEDIS hybrid measures. Am J Manag Care 2007;13(10):553-8.
21. Tang PC, Ralston M, Arrigotti MF, Qureshi L, Graham J. Comparison of Methodologies for Calculating Quality Measures Based on Administrative Data versus Clinical Data from an Electronic Health Record System: Implications for Performance Measures. Journal of the American Medical Informatics Association 2007;14(1):10-15.
22. Fowles JB, Rosheim K, Fowler EJ, Craft C, Arrichiello L. The validity of self-reported diabetes quality of care measures. Int J Qual Health Care 1999;11(5):407-12.
23. Rector TS, Wickstrom SL, Shah M, Thomas Greenlee N, Rheault P, Rogowski J, et al. Specificity and sensitivity of claims-based algorithms for identifying members of Medicare+Choice health plans that have chronic medical conditions. Health Serv Res 2004;39(6 Pt 1):1839-57.
24. Ganz DA, Almeida S, Roth CP, Reuben DB, Wenger NS. Can structured data fields accurately measure quality of care? The example of falls. J Rehabil Res Dev 2012;49(9):1411-20.
25. Roth CP, Lim YW, Pevnick JM, Asch SM, McGlynn EA. The challenge of measuring quality of care from the electronic health record. Am J Med Qual 2009;24(5):385-94.
26. Persell SD, Wright JM, Thompson JA, Kmetik KS, Baker DW. Assessing the validity of national quality measures for coronary artery disease using an electronic health record. Arch Intern Med 2006;166(20):2272-7.
27. Solberg LI, Engebretson KI, Sperl-Hillen JM, Hroscikoski MC, O'Connor PJ. Are claims data accurate enough to identify patients for performance measures or quality improvement? The case of diabetes, heart disease, and depression. Am J Med Qual 2006;21(4):238-45.
28. Borzecki AM, Wong AT, Hickey EC, Ash AS, Berlowitz DR. Can we use automated data to assess quality of hypertension care? Am J Manag Care 2004;10(7 Pt 2):473-9.
29. Weiskopf NG, Hripcsak G, Swaminathan S, Weng C. Defining and measuring completeness of electronic health records for secondary use. J Biomed Inform 2013.
30. Baldwin KB. Evaluating healthcare quality using natural language processing. J Healthc Qual 2008;30(4):24-9.
31. Baker DW, Persell SD, Thompson JA, Soman NS, Burgner KM, Liss D, et al. Automated review of electronic health records to assess quality of care for outpatients with heart failure. Annals of Internal Medicine 2007;146(4):270-7.
32. Pakhomov SS, Hemingway H, Weston SA, Jacobsen SJ, Rodeheffer R, Roger VL. Epidemiology of angina pectoris: role of natural language processing of the medical record. Am Heart J 2007;153(4):666-73.
33. Chan KS, Fowles JB, Weiner JP. Review: electronic health records and the reliability and validity of quality measures: a review of the literature. [Review]. Medical Care Research & Review 2010;67(5):503-27.
34. Parsons A, McCullough C, Wang J, Shih S. Validity of electronic health record-derived quality measurement for performance monitoring. J Am Med Inform Assoc 2011.
35. Tu K, Mitiku T, Lee DS, Guo H, Tu JV. Validation of physician billing and hospitalization data to identify patients with ischemic heart disease using data from the Electronic Medical Record Administrative data Linked Database (EMRALD). Canadian Journal of Cardiology 2010;26(7):e225-8.
36. Owen RR, Thrush CR, Cannon D, Sloan KL, Curran G, Hudson T, et al. Use of electronic medical record data for quality improvement in schizophrenia treatment. J Am Med Inform Assoc 2004;11(5):351-7.
37. Chapman WW, Fiszman M, Chapman BE, Haug PJ. A comparison of classification algorithms to automatically identify chest X-ray reports that support pneumonia. J Biomed Inform 2001;34(1):4-14.
38. Denny JC, Peterson JF, Choma NN, Xu H, Miller RA, Bastarache L, et al. Extracting timing and status descriptors for colonoscopy testing from electronic medical records. J Am Med Inform Assoc 2010;17(4):383-8.
39. Hripcsak G, Friedman C, Alderson PO, DuMouchel W, Johnson SB, Clayton PD. Unlocking clinical data from narrative reports: a study of natural language processing. Ann Intern Med 1995;122(9):681-8.
40. Melton GB, Hripcsak G. Automated detection of adverse events using natural language processing of discharge summaries. J Am Med Inform Assoc 2005;12(4):448-57.
41. Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC Med Inform Decis Mak 2006;6:30.
42. Jain NL, Knirsch CA, Friedman C, Hripcsak G. Identification of suspected tuberculosis patients based on natural language processing of chest radiograph reports. Proc AMIA Annu Fall Symp 1996:542-6.
43. Byrd RJ, Steinhubl SR, Sun J, Ebadollahi S, Stewart WF. Automatic identification of heart failure diagnostic criteria, using text analysis of clinical notes from electronic health records. Int J Med Inform 2013.
44. Goulet JL, Erdos J, Kancir S, Levin FL, Wright SM, Daniels SM, et al. Measuring performance directly using the veterans health administration electronic medical record: a comparison with external peer review. Med Care 2007;45(1):73-9.
45. Botsis T, Hartvigsen G, Chen F, Weng C. Secondary Use of EHR: Data Quality Issues and Informatics Opportunities. AMIA Summits Transl Sci Proc 2010;2010:1-5.
46. Treatment Guidelines from the Medical Letter. The Medical Letter, Inc