Apache cTAKES™

Introduction

The Apache™ clinical Text Analysis and Knowledge Extraction System (cTAKES™) focuses on extracting knowledge from clinical text through Natural Language Processing (NLP) techniques.

cTAKES is engineered in a modular fashion and employs leading-edge rule-based and machine learning methods.

cTAKES has standard features for biomedical text processing software, including the ability to extract concepts such as symptoms, procedures, diagnoses, medications and anatomy with attributes and standard codes.

More powerful components can perform tasks as complex as identifying temporal events, dates and times – resulting in placement of events in a patient timeline.

Components are trained on gold standards from the biomedical as well as the general domain. This affords usability across different types of clinical narrative (e.g. radiology reports, clinical notes, discharge summaries) in various institution formats as well as other types of health-related narrative (e.g. Twitter feeds), using multiple data standards (e.g. Health Level 7 (HL7), Clinical Document Architecture (CDA), Fast Healthcare Interoperability Resources (FHIR), SNOMED-CT, RxNORM).

cTAKES is the NLP platform for many initiatives across the world covering a variety of research purposes and large datasets. Contributors include professionals at medical and commercial institutions, NLP and Machine Learning researchers, Medical Doctors, and students of many disciplines and levels. We encourage people from all backgrounds to get involved! (link)

Supported Environments

Java 1.8 is required to run cTAKES versions 5.x and older. Versions 6+ require Java 17. Run this command to check your Java version:

$ java -version
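
The Java requirement above can be checked mechanically. The following is a minimal sketch, not part of cTAKES: the function names `java_major` and `ctakes_for_java` are hypothetical, and the mapping is deliberately strict, reporting only the two pairings this README states (Java 1.8 for 5.x and older, Java 17 for 6+). Other Java versions may or may not work; the sketch makes no claim about them.

```shell
# Extract the major Java version from a version string.
# Legacy strings use a "1." prefix: "1.8.0_392" is Java 8; "17.0.9" is Java 17.
java_major() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;      # drop the legacy "1." prefix
  esac
  echo "${v%%[!0-9]*}"     # keep only the leading digits
}

# Map a Java version string to the cTAKES release line named in this README.
# Assumption: anything other than Java 8 or Java 17 is reported as not listed.
ctakes_for_java() {
  case "$(java_major "$1")" in
    17) echo "cTAKES 6+" ;;
    8)  echo "cTAKES 5.x and older" ;;
    *)  echo "not listed in this README" ;;
  esac
}
```

Because `java -version` writes to standard error and quotes the version, one way to feed it in is `ctakes_for_java "$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"`.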

Maven 3 is required to build cTAKES. Run this command to check your Maven version:

$ mvn -version

A license for the Unified Medical Language System (UMLS) is required to use the named entity recognition module (dictionary lookup) with the default dictionary.

Python 3 is required to use the cTAKES Python Bridge to Java (PBJ). Run this command to check your Python version:

$ python -V
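
The individual checks in this section can be combined into one pass. The sketch below is illustrative only: `missing_tools` is a hypothetical helper, not a cTAKES script, and it tests only whether each command is on `PATH`, not which version is installed.

```shell
# Print each named command that cannot be found on PATH.
# Returns 0 when every command is present, 1 when at least one is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example call (commented out so that sourcing this file has no side effects):
# missing_tools java mvn python
```

On some systems the Python 3 interpreter is installed as `python3` rather than `python`, so the list of names passed in may need adjusting.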

Getting Started

New Users

The easiest way for new users to get a jump start running cTAKES is to use the Standard Pipeline Installation Facility. The Standard Pipeline Installation Facility is a tool that can install cTAKES configured to run the most popular cTAKES pre-built pipelines. You can then use the Piper File Submitter GUI to submit jobs or submit them from the command line.

For access to all cTAKES capabilities, download a zip or tar.z file containing a fully-built installation of the most recent cTAKES release. Then, after obtaining a UMLS license, use the UMLS Package Fetcher GUI to install a copy of the default dictionary for Named Entity Recognition (NER) using cTAKES Fast Dictionary Lookup.

New Developers

Notice: cTAKES 7.0.0-SNAPSHOT requires JDK 17 to build and run.

All source code for cTAKES versions 5+ is available from the cTAKES GitHub repository.

Clone this repository

$ git clone https://github.com/apache/ctakes.git

Open your local copy of the repository in an IDE of your choice. Then either:
Run directly from the code (link),
or
Build a binary installation (link), and then
Run the binary installation (link).
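
For a command-line route, the clone and build steps might be scripted roughly as follows. This is a sketch under stated assumptions: `run` and `clone_and_build` are hypothetical helpers, and `mvn clean package` is the generic Maven lifecycle invocation rather than a command taken from the cTAKES build instructions, which may call for different goals or profiles. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
# Echo a command, and execute it only when RUN=1 (dry run by default).
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Clone the repository into a target directory and build it with Maven.
# Assumption: a plain "clean package" from the root pom.xml is sufficient.
clone_and_build() {
  dest=${1:-ctakes}   # hypothetical target directory name
  run git clone https://github.com/apache/ctakes.git "$dest" &&
  run mvn -f "$dest/pom.xml" clean package
}
```

Calling `clone_and_build` with no arguments prints the two commands for a checkout named `ctakes`; `RUN=1 clone_and_build` would actually run them, which requires network access plus the Java and Maven versions listed under Supported Environments.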

More information

Much more information can be found on the cTAKES wiki.

You can also write to the cTAKES user and developer mailing lists: user at ctakes.apache.org and dev at ctakes.apache.org. Answers to previously asked questions can be found by searching the user and developer mail archives.

