Natural language processing, such as free-text searches of the clinical and progress notes in patients' electronic medical records (EMRs), performed better at identifying post-operative surgical complications than the administrative data codes commonly used in EMRs, according to a study in the August 24/31 issue of JAMA.

To improve the identification of patient safety concerns, the Agency for Healthcare Research and Quality developed a set of 20 measures, known as the patient safety indicators, which use administrative data to screen for potential adverse events that occur during hospitalization, according to background information in the article. "Currently, most automated methods to identify patient safety occurrences rely on administrative data codes; however, free-text searches of electronic medical records could represent an additional surveillance approach," the authors write.

"The development of automated approaches, such as natural language processing, that extract specific medical concepts from textual medical documents that do not rely on discharge codes, offers a powerful alternative to either unreliable administrative data or labor-intensive, expensive manual chart reviews. Nevertheless, there have been few studies investigating natural language processing tools for the detection of adverse events. It is not known whether a surveillance approach based on language processing searches of free-text documents will perform better than currently used tools based on administrative data."

Harvey J. Murff, M.D., M.P.H., of the Veterans Affairs Medical Center and Vanderbilt University, and colleagues conducted a study to evaluate a natural language processing-based approach. Among the outcomes measured were post-operative occurrences of acute renal failure requiring dialysis, deep vein thrombosis, pulmonary embolism, sepsis, pneumonia, or heart attack identified through medical record review as part of the VA Surgical Quality Improvement Program. The researchers determined the sensitivity and specificity of the natural language processing approach to identify these complications and compared its performance with patient safety indicators that use discharge coding information.
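For readers less familiar with the two measures: sensitivity is the share of true complications that a method flags, and specificity is the share of complication-free cases that it correctly leaves unflagged. The following is a minimal sketch of that calculation, not the study's code; the function name and the ten toy cases are illustrative assumptions, with the reference labels standing in for the manual record review.

```python
def sensitivity_specificity(reference, flagged):
    """Return (sensitivity, specificity) for two equal-length lists of booleans.

    reference: True where the reference standard found a complication.
    flagged:   True where the screening method flagged the case.
    """
    if len(reference) != len(flagged):
        raise ValueError("reference and flagged must have the same length")

    pairs = list(zip(reference, flagged))
    true_pos = sum(1 for r, f in pairs if r and f)
    false_neg = sum(1 for r, f in pairs if r and not f)
    true_neg = sum(1 for r, f in pairs if not r and not f)
    false_pos = sum(1 for r, f in pairs if not r and f)

    # Guard against division by zero when a class is absent from the sample.
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    specificity = true_neg / (true_neg + false_pos) if (true_neg + false_pos) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical): 4 true complications, 6 complication-free cases.
reference = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(reference, flagged)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy sample the method catches three of the four true complications and wrongly flags one of the six complication-free cases, which illustrates how a method can gain sensitivity at some cost in specificity.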

The researchers found that, in general, the natural language processing-based approach had higher sensitivities and lower specificities than the patient safety indicators. "The increase in sensitivity of the natural language processing-based approach, compared with the patient safety indicator, was more than two-fold for acute renal failure and sepsis and over 12-fold for pneumonia. Specificities were four percent to seven percent higher with the patient safety indicator method than the natural language processing approach."

"Natural language processing correctly identified 82 percent of acute renal failure cases compared with 38 percent for patient safety indicators. Similar results were obtained for venous thromboembolism (59 percent vs. 46 percent), pneumonia (64 percent vs. five percent), sepsis (89 percent vs. 34 percent), and post-operative myocardial infarction (91 percent vs. 89 percent). Both natural language processing and patient safety indicators were highly specific for these diagnoses."
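The fold increases in sensitivity can be checked roughly against the percentages quoted above. This is a sketch only: the inputs are the rounded figures reported in this article, so the ratios are approximate, and the dictionary layout and variable names are illustrative assumptions.

```python
# Sensitivity in percent as quoted in the article: (natural language processing, patient safety indicator)
reported = {
    "acute renal failure": (82, 38),
    "venous thromboembolism": (59, 46),
    "pneumonia": (64, 5),
    "sepsis": (89, 34),
    "myocardial infarction": (91, 89),
}

# Fold increase = NLP sensitivity divided by patient safety indicator sensitivity.
fold = {name: nlp / psi for name, (nlp, psi) in reported.items()}

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold")
```

The ratios for acute renal failure and sepsis come out above two, and the ratio for pneumonia above 12, in line with the authors' summary.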

Ashish K. Jha, M.D., M.P.H., of the Harvard School of Public Health, Boston, writes in an accompanying editorial that, "although the promise of natural language processing is substantial, its benefits will not be realized without considerable new investment in research and development.

"Murff and colleagues focused on one specific application of identifying adverse events after surgery. Dozens of permutations and combinations of syntax were tested and customized to identify the optimal strategy for finding complications in an electronic health record. To realize the benefits of natural language processing, this kind of research will need further development not only to find better algorithms but also to investigate EHR analysis for disciplines other than surgery and optimize automated EHR searches for different types of clinicians."