Discovery and visualization of new information from clinical reports
Funding Agency: AHRQ R01 HS022085 (Melton-Meaux)
Project Dates: 09/30/2013–09/29/2017
This grant develops and evaluates visualization methods that “highlight” important information in clinical texts, improves user interface design for clinical texts, and conducts a prospective clinical trial of an EHR-based tool that highlights new, non-redundant information in clinical documents.
Natural language processing for clinical and translational research
Funding Agency: NIH/NIGMS R01 GM102282 (Liu/Pakhomov/Xu)
Project Dates: 04/01/2013–03/31/2017
The overall goal of this project is to develop a novel framework to enable the use of clinical information embedded in clinical narratives for clinical and translational research.
Leveraging the EHR to collect and analyze social, behavioral & familial factors
Funding Agency: NIH/NLM R01 LM011364 (Chen/Melton)
Project Dates: 09/01/2012–08/31/2017
The overall goal of this project is to develop and evaluate computational methods for generating knowledge regarding the relationships between diseases and social, behavioral, and familial factors.
University of Minnesota Clinical and Translational Science Institute (CTSI)
Funding Agency: NIH/NCRR U54 RR026066 (Blazar)
Project Dates: 06/01/2011–05/31/2016
The major goals of this infrastructure award are to support clinical and translational research at the University of Minnesota and to transform research processes within the institution and community. The NLP-IE group is developing an NLP platform as a resource for clinical researchers.
Research areas of interest
Automated sentiment and topic analysis of medical training evaluation text
Post-graduate residency training and other aspects of medical training increasingly use electronic systems to evaluate trainee performance against defined training competencies. These systems collect both quantitative and qualitative data, the latter of which typically consists of text comments. This work uses text-mining techniques to help medical educators analyze residency evaluations by identifying statement topics and performing sentiment analysis on statements. In addition to validating these techniques, this work aims to correlate automated findings with objective trainee outcomes.
Discovery of drug-drug interactions from biomedical literature
Drug-drug interactions (DDIs) are a serious concern in clinical practice as physicians strive to provide the highest quality patient care. While DDI lists are commonly used in clinical practice to alert clinicians during prescribing, many DDIs resulting from various pathways are not widely known. Such interactions may be indirectly derived from the scientific literature through informatics methods. The objective of this study is to use Semantic MEDLINE to uncover potential DDIs in clinical data.
Identification of new versus redundant information from clinical notes
In EHR systems, a clinician can create new notes by “copying and pasting” text from previous notes. Additionally, some EHR systems “pull” known information such as the medication list, past medical history, and other parts of the record directly into clinical notes. This results in significant amounts of redundant information in clinical texts, which makes these notes harder to read and to sift through for the practicing clinicians who must use them. Redundant information also increases the length of clinical notes and de-emphasizes important new information, placing an additional cognitive load on clinicians, who must read and synthesize these notes in a time-constrained clinical environment with frequent interruptions. The goal of this research is to develop computational methods customized for clinical texts to identify new (non-redundant) information in the clinical notes in the EHR.
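As a rough illustration of the task (not the method developed in this research), the sketch below labels each sentence of a new note as "redundant" or "new" by comparing its word set against every sentence in the patient's prior notes. The sentence splitter, the Jaccard measure, the 0.8 threshold, and all function names are assumptions chosen for simplicity; the example notes are invented.

```python
import re

def tokenize(text):
    """Lowercase a string and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def split_sentences(note):
    """Naive splitter on ., ! and ? (an assumption; real clinical text needs more care)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]

def jaccard(a, b):
    """Jaccard overlap between two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def label_sentences(new_note, prior_notes, threshold=0.8):
    """Label each sentence of new_note as 'redundant' or 'new'.

    A sentence is 'redundant' when its best Jaccard overlap with any
    sentence from the prior notes reaches the (hypothetical) threshold.
    """
    prior = [tokenize(s) for note in prior_notes for s in split_sentences(note)]
    results = []
    for sentence in split_sentences(new_note):
        tokens = tokenize(sentence)
        best = max((jaccard(tokens, p) for p in prior), default=0.0)
        results.append((sentence, "redundant" if best >= threshold else "new"))
    return results

# Invented example notes, for illustration only.
prior_notes = ["Patient has a history of hypertension. Takes lisinopril 10 mg daily."]
new_note = "Patient has a history of hypertension. Reports new onset chest pain today."
for sentence, label in label_sentences(new_note, prior_notes):
    print(label, "|", sentence)
```

A highlighting tool could then emphasize only the sentences labeled "new". Word-set overlap ignores word order and paraphrase, so this is a baseline for intuition rather than a usable detector.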
Information extraction from operative reports
As an important branch of medicine, surgery is concerned with the treatment of injuries or disorders of the body through operative procedures. Various factors such as the technique used, incision length, or supplies used (e.g., mesh type, prosthetic) can affect surgical patient outcomes. Surgeons need to determine the best way to perform procedures based on the best accessible evidence. The goal of this research is to build methods that efficiently extract information on the techniques, instruments, materials, and other factors surrounding operative procedures from operative reports, and that present this information in a succinct and easily comprehensible form for secondary uses such as summarization or high-throughput clinical research.
Semantic similarity and relatedness
Identifying semantically similar and related terms in the biomedical and clinical domains has proven useful in various Natural Language Processing (NLP) tasks such as Question-Answering and Information Extraction. The goal of this research is to develop new methods, specific to clinical text, for computing semantic similarity and relatedness by leveraging clinical corpora and the domain knowledge contained in biomedical thesauri such as the Unified Medical Language System (UMLS), and to incorporate semantic similarity and relatedness into NLP tasks.
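To make the idea of a knowledge-based similarity measure concrete, here is a minimal sketch of one common approach: a path-based score over an is-a hierarchy, computed as 1 divided by the number of nodes on the shortest path between two concepts. The tiny hierarchy and the function names are invented for illustration; they are not drawn from the UMLS and do not represent the interface of the package mentioned on this page.

```python
# Toy is-a hierarchy (child -> parent); illustrative only, not taken from the UMLS.
PARENT = {
    "myocardial infarction": "heart disease",
    "heart failure": "heart disease",
    "heart disease": "disease",
    "pneumonia": "lung disease",
    "lung disease": "disease",
}

def ancestors(concept):
    """Return the concept followed by its ancestors, ending at the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def path_similarity(a, b):
    """Return 1 / (number of nodes on the shortest is-a path between a and b).

    Returns 0.0 when the two concepts share no ancestor.
    """
    chain_a, chain_b = ancestors(a), ancestors(b)
    steps_from_b = {concept: i for i, concept in enumerate(chain_b)}
    for steps_from_a, concept in enumerate(chain_a):
        if concept in steps_from_b:
            # First shared node walking up from a is the lowest common ancestor.
            edges = steps_from_a + steps_from_b[concept]
            return 1.0 / (edges + 1)
    return 0.0

print(path_similarity("myocardial infarction", "heart failure"))
print(path_similarity("myocardial infarction", "pneumonia"))
```

In this toy hierarchy the two heart conditions score higher than a heart condition paired with pneumonia, because their shortest path is shorter. A path measure of this kind captures similarity only; relatedness (for example, a drug and the disease it treats) generally requires other relation types or corpus statistics.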
We have developed the semantic similarity and relatedness package:
Word sense disambiguation of acronyms, abbreviations, and symbols
The goal of this research is to develop effective techniques for word sense disambiguation (WSD) of acronyms, abbreviations, and symbols in clinical documents, an essential and unsolved problem for effective medical NLP systems. A key step toward this goal is building a comprehensive clinical sense inventory that integrates available biomedical resources with senses drawn from a large corpus of clinical notes. Our research explores issues related to the optimization of automated machine-learning techniques, including minimization of sample sizes, the contributions and value of different feature types, use of semi-supervised and unsupervised techniques, techniques for rare sense detection, and variation in the window size and orientation used to extract features for machine-learning algorithms.
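The role of window size and orientation in feature extraction can be sketched as follows. This is a simplified illustration under stated assumptions, not the group's system: the features are plain context words, the "classifier" just counts word overlap with each sense's training contexts, and the training sentences and all function names are invented.

```python
from collections import Counter, defaultdict

def window_features(tokens, index, size=3, orientation="both"):
    """Collect lowercase context words around tokens[index].

    size: number of tokens taken on each included side.
    orientation: "left", "right", or "both" (which side of the target to use).
    """
    left = tokens[max(0, index - size):index] if orientation in ("left", "both") else []
    right = tokens[index + 1:index + 1 + size] if orientation in ("right", "both") else []
    return [t.lower() for t in left + right]

def train(examples, size=3, orientation="both"):
    """Build a context-word count profile per sense.

    examples: iterable of (tokens, target_index, sense) tuples.
    """
    profiles = defaultdict(Counter)
    for tokens, index, sense in examples:
        profiles[sense].update(window_features(tokens, index, size, orientation))
    return profiles

def disambiguate(profiles, tokens, index, size=3, orientation="both"):
    """Pick the sense whose training contexts share the most words with this context."""
    feats = window_features(tokens, index, size, orientation)
    scores = {sense: sum(counts[f] for f in feats) for sense, counts in profiles.items()}
    return max(scores, key=scores.get)

# Invented training contexts for the abbreviation "RA", for illustration only.
TRAIN = [
    ("RA with joint pain on methotrexate".split(), 0, "rheumatoid arthritis"),
    ("echo shows dilated RA and right ventricle".split(), 3, "right atrium"),
]
profiles = train(TRAIN)
print(disambiguate(profiles, "history of RA with joint swelling".split(), 2))
print(disambiguate(profiles, "dilated RA on echo".split(), 1))
```

Changing `size` or `orientation` changes which context words reach the classifier, which is the kind of variation the research above evaluates. A real system would replace the overlap count with a trained machine-learning model and a much larger sense inventory.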
Our work includes de-identified resources for symbols and acronyms/abbreviations available for research purposes: