Integrated Clinical Document Processing and Curation with Linguamatics I2E
LabKey’s clinical document abstraction solution integrates the Linguamatics I2E natural language processing (NLP) engine to streamline the abstraction and curation of unstructured data files. Teams using LabKey + I2E can significantly reduce the manual work involved in extracting structured data from free-text documents and reports by automating the acquisition, processing, and assignment of files for curation.
LabKey ETLs automate the import of free-text documents. Files can also be uploaded manually using the File Management system, which offers additional flexibility to add documents ad hoc or in small batches.
Teams can define document abstraction and review workflows that support their specific research scenario and ensure that documents follow a consistent curation process. Workflows can combine automated and manual processes.
Reporting features in LabKey provide a complete view of an organization’s curation operations. Managers can use information about abstractor workloads and metrics like average abstraction time per document to help balance and optimize operations.
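As a rough illustration of the workload metrics mentioned above, the following sketch computes the number of documents and the average abstraction time per abstractor. It does not use LabKey's reporting features or API; the `AbstractionRecord` type, its field names, and the sample values are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AbstractionRecord:
    """Hypothetical record of one completed abstraction (not a LabKey type)."""
    abstractor: str
    document_id: str
    minutes_spent: float


def workload_summary(records):
    """Return document count and average minutes per document for each abstractor."""
    totals = defaultdict(lambda: [0, 0.0])  # abstractor -> [document count, total minutes]
    for record in records:
        totals[record.abstractor][0] += 1
        totals[record.abstractor][1] += record.minutes_spent
    return {
        name: {"documents": count, "avg_minutes": minutes / count}
        for name, (count, minutes) in totals.items()
    }


# Made-up sample data for illustration only.
records = [
    AbstractionRecord("alice", "doc-1", 12.0),
    AbstractionRecord("alice", "doc-2", 8.0),
    AbstractionRecord("bob", "doc-3", 15.0),
]
summary = workload_summary(records)
```

A manager could compare the `documents` and `avg_minutes` values across abstractors to decide how to distribute the next batch.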
Processing Pipeline with I2E
Files added to LabKey are run through the Linguamatics I2E NLP engine as part of an integrated data processing pipeline before they become available in the UI for abstraction. I2E indexes documents and extracts target values for review by abstractors.
A curation UI presents abstractors with a side-by-side view of unstructured documents and the data fields for abstraction, allowing them to efficiently review and record data points in a single screen. Abstractors and reviewers can easily toggle between documents in their queue and monitor their progress through an assigned document batch.
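The flow described in this section (file added, NLP processing, assignment, abstraction) can be pictured as a small state machine. The sketch below is a conceptual model only: every class and function name is hypothetical, and `toy_extractor` is a trivial keyword stand-in, not the I2E engine or any LabKey or Linguamatics API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    UPLOADED = "uploaded"
    NLP_PROCESSED = "nlp_processed"
    ASSIGNED = "assigned"
    ABSTRACTED = "abstracted"


@dataclass
class Document:
    doc_id: str
    text: str
    status: Status = Status.UPLOADED
    suggested_values: dict = field(default_factory=dict)  # filled in by the NLP step
    confirmed_values: dict = field(default_factory=dict)  # filled in by the abstractor
    abstractor: Optional[str] = None


def toy_extractor(text):
    """Stand-in for the NLP engine: finds one hard-coded phrase."""
    found = {}
    if "grade 2" in text.lower():
        found["grade"] = "2"
    return found


def run_nlp(doc, extractor):
    """Extract candidate values before the document reaches any abstractor."""
    doc.suggested_values = extractor(doc.text)
    doc.status = Status.NLP_PROCESSED


def assign(doc, abstractor):
    """Place a processed document in an abstractor's queue."""
    if doc.status is not Status.NLP_PROCESSED:
        raise ValueError("document must be NLP-processed before assignment")
    doc.abstractor = abstractor
    doc.status = Status.ASSIGNED


def record_abstraction(doc, values):
    """Store the abstractor's values; they override the NLP suggestions."""
    if doc.status is not Status.ASSIGNED:
        raise ValueError("document must be assigned before abstraction")
    doc.confirmed_values = {**doc.suggested_values, **values}
    doc.status = Status.ABSTRACTED


doc = Document("doc-1", "Invasive ductal carcinoma, Grade 2.")
run_nlp(doc, toy_extractor)
assign(doc, "alice")
record_abstraction(doc, {"histology": "invasive ductal carcinoma"})
```

The status checks reflect the ordering stated above: a document is not offered for abstraction until the NLP step has run.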
Querying & Analysis
The data generated by the curation process is stored in a structured format, allowing users to run simple and complex queries to locate datasets of interest.
About Linguamatics I2E Natural Language Processing Engine
I2E is an agile and interactive text mining platform for the extraction and analysis of information. Linguamatics uses a powerful blend of methods, including Machine Learning, for high precision and recall.
Linguamatics I2E Extracts:
- Pathology report data, including cancer histology, grade, behavior, biomarker values, and stage
- Patient profile data, including diseases, medications, and lab values
- Social determinants of health and lifestyle factors to support population health analytics
- Phenotypic characteristics to support genotype-phenotype studies using the Human Phenotype Ontology
Key I2E Capabilities include:
- Data Discovery – using large-scale unannotated data sets, analysts can rapidly and iteratively develop algorithms, saving months compared to manual chart review or gold standard development
- Democratized NLP – an intuitive GUI provides broader user access to NLP, with no coding or scripting required
- Programmatic Workflow Integration – SOA-friendly RESTful web services provide fail-safe and recoverable NLP processes for tight integration with LabKey’s ETL (Extract, Transform, Load) workflows