AUTOMATED DE-IDENTIFICATION OF DISTRIBUTIONAL SEMANTICS MODELS
FAMILY HISTORY EXTRACTION FROM CLINICAL TEXTS
Family history information is essential for understanding disease risk and, more specifically, for individualized disease prevention, diagnosis, and treatment. Our previous work has included analyzing the representation of family history information in the electronic health record (EHR) and developing a more comprehensive family history representation model. BioMedICUS includes a family history module that identifies family history statements, observations (e.g., disease or procedure), the relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predications (“indicator phrases”) that are used to establish relationships between observations and family members.
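To make the elements listed above concrete, the following is a minimal sketch of how one family history statement might be represented in code. All class and field names here are hypothetical illustrations chosen for this example; they are not the BioMedICUS data types, and attaching the attributes to the family member simply follows the wording of the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FamilyMember:
    """A relative or side of family, with the attributes named in the text."""
    relative: str                           # e.g. "mother", "maternal side"
    vital_status: Optional[str] = None      # e.g. "living", "deceased"
    age_of_diagnosis: Optional[int] = None
    certainty: Optional[str] = None         # e.g. "confirmed", "possible"
    negated: bool = False


@dataclass
class Observation:
    """A disease or procedure mentioned in the statement."""
    text: str
    kind: str                               # "disease" or "procedure"


@dataclass
class Predication:
    """An indicator phrase linking an observation to a family member."""
    indicator_phrase: str
    observation: Observation
    family_member: FamilyMember


@dataclass
class FamilyHistoryStatement:
    text: str
    predications: List[Predication] = field(default_factory=list)

    def affirmed_observations(self) -> List[str]:
        """Return the observations that are not negated."""
        return [
            p.observation.text
            for p in self.predications
            if not p.family_member.negated
        ]


# Hypothetical example sentence, annotated by hand for illustration.
statement = FamilyHistoryStatement(
    text="Mother had breast cancer at age 52. Father has no history of diabetes.",
    predications=[
        Predication(
            indicator_phrase="had",
            observation=Observation("breast cancer", "disease"),
            family_member=FamilyMember("mother", age_of_diagnosis=52),
        ),
        Predication(
            indicator_phrase="history of",
            observation=Observation("diabetes", "disease"),
            family_member=FamilyMember("father", negated=True),
        ),
    ],
)

print(statement.affirmed_observations())
```

Keeping the predication as its own object, rather than nesting observations inside relatives, lets one sentence relate several observations to several family members.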
HL7/LOINC DOCUMENT ONTOLOGY: ROLE AXIS EVALUATION
MTAP is a foundational framework that enables users to create text analysis and NLP components. MTAP bridges the gap between idea prototyping and production-scale deployments by providing distributed data models and processing tools, using gRPC as the underlying communication framework. MTAP supports Python- and Java-based components working in tandem and is designed for ease of use, so that users with minimal development experience can build with it.
SEMANTIC SIMILARITY AND RELATEDNESS PACKAGE
We present the NLP Type and Annotation Browser (NLP-TAB), an open-source system that facilitates exploration and analysis of NLP applications and their components without prior knowledge of their implementation. By storing and analyzing the results produced by each NLP application on one or more corpora using a type-agnostic data model, we allow users to discover which annotations best match their specific information retrieval tasks, as well as to run comparisons between annotation types of separate applications.
The ultimate goal of NLP-TAB is to facilitate the development and deployment of information extraction systems that make use of the results of multiple NLP applications developed using the Apache Unstructured Information Management Architecture (UIMA) platform (http://uima.apache.org/), maximizing their relative strengths and minimizing their weaknesses. To reach that goal, NLP-TAB has a threefold purpose. First, it allows users to explore and evaluate disparate NLP applications and the annotations they create through several visualization and information retrieval techniques. Second, it combines the results of different NLP systems for subsequent information retrieval. Here, leveraging multiple NLP applications may improve the accuracy and reliability of information extraction from medical texts, particularly when the NLP applications produce complementary results. NLP-TAB is designed to elucidate the degree to which different NLP applications are complementary. Third, NLP-TAB may eventually enable the reuse and interoperability of components from different pipelines through analysis and unsupervised creation of mappings between data types.
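As an illustration of what comparing annotation types from separate applications can look like, the sketch below stores annotations in a type-agnostic form (system, type name, character offsets) and measures how much two types agree. It is a minimal sketch under stated assumptions: the record layout, the function names, the system and type names, and the choice of exact-span matching with a Jaccard score are all ours for this example, not a description of how NLP-TAB is implemented.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple, Union

Span = Tuple[int, int]


class Annotation(NamedTuple):
    """Type-agnostic record: any system, any type, located by offsets."""
    system: str
    type_name: str
    begin: int
    end: int


def spans_of(annotations: Iterable[Annotation],
             system: str, type_name: str) -> Set[Span]:
    """Collect the (begin, end) spans of one annotation type from one system."""
    return {
        (a.begin, a.end)
        for a in annotations
        if a.system == system and a.type_name == type_name
    }


def compare(a_spans: Set[Span],
            b_spans: Set[Span]) -> Dict[str, Union[int, float]]:
    """Count exact-span agreement and what each side contributes alone."""
    shared = a_spans & b_spans
    union = a_spans | b_spans
    return {
        "shared": len(shared),
        "only_a": len(a_spans - b_spans),   # found only by the first system
        "only_b": len(b_spans - a_spans),   # found only by the second system
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical output of two systems run over the same document.
annotations = [
    Annotation("system_a", "Token", 0, 7),
    Annotation("system_a", "Token", 8, 16),
    Annotation("system_a", "Token", 17, 21),
    Annotation("system_b", "WordToken", 0, 7),
    Annotation("system_b", "WordToken", 8, 16),
    Annotation("system_b", "WordToken", 22, 30),
]

result = compare(
    spans_of(annotations, "system_a", "Token"),
    spans_of(annotations, "system_b", "WordToken"),
)
print(result)
```

A high Jaccard score suggests the two types are candidates for a mapping between data types, while large `only_a` and `only_b` counts suggest the systems are complementary and may be worth combining.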