Name: nlc-icd10-classifier
Owner: International Business Machines
Description: A simple web app that shows how Watson's Natural Language Classifier (NLC) can classify ICD-10 codes. The app is written in Python using the Flask framework and leverages the Watson Developer Cloud Python SDK.
Created: 2017-11-24 19:14:39.0
Updated: 2018-05-03 15:28:18.0
Pushed: 2018-04-24 15:49:08.0
Homepage: https://developer.ibm.com/code/patterns/classify-icd-10-data-with-watson/
Size: 2581
Language: Python
DISCLAIMER: This application is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review. It is not intended to serve as a medical application. There is no representation as to the accuracy of the output of this application and it is presented without warranty.
This application was built to demonstrate IBM's Watson Natural Language Classifier (NLC). The data set we will be using, ICD-10-GT-AA.csv, contains a subset of ICD-10 entries. ICD-10 is the 10th revision of the International Statistical Classification of Diseases and Related Health Problems. In short, it is a medical classification list by the World Health Organization (WHO) that contains codes for: diseases, signs and symptoms, abnormal findings, complaints, social circumstances, and external causes of injury or diseases. Hospitals and insurance companies alike could save time and money by leveraging Watson to properly tag the most accurate ICD-10 codes.
This application is a Python web application built on the Flask microframework and based on earlier work by Ryan Anderson. It uses the Watson Python SDK to create the classifier, list classifiers, and classify the input text. We also make use of the freely available ICD-10 API which, given an ICD-10 code, returns a name and description.
When the reader has completed this pattern, they will understand how to:
Here we create the classifier with our ICD-10 dataset.
Clone this project: git clone git@github.com:IBM/nlc-icd10-classifier.git and cd into the new directory.
We'll be using the ICD-10-GT-AA.csv dataset in the data folder. Note that this is a subset of the entire ICD-10 classification set, which keeps training time shorter.
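Since training takes hours, it can be worth checking the class distribution of the training file first. The following is a minimal sketch, assuming the usual NLC training layout (text in the first column, one or more class labels in the remaining columns). The function name and the sample rows are illustrative and are not taken from the repository or from ICD-10-GT-AA.csv.

```python
import csv
import io
from collections import Counter


def count_examples_per_class(csv_text):
    """Count training examples per class label.

    Assumes the NLC training layout: text in the first column and one or
    more class labels in the remaining columns.
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank rows and rows without a label
        for label in row[1:]:
            label = label.strip()
            if label:
                counts[label] += 1
    return counts


# Hypothetical rows for illustration only; not taken from ICD-10-GT-AA.csv.
sample = (
    '"Cholera, unspecified",A00.9\n'
    "Typhoid fever,A01.0\n"
    "Typhoid fever with complications,A01.0\n"
)
print(count_examples_per_class(sample).most_common())
```

To run it against the real file, pass the contents of data/ICD-10-GT-AA.csv instead of the sample string. The file's encoding is not stated here, so treat UTF-8 as an assumption.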
Create an NLC service in IBM Cloud and make a note of the service name used in the catalog; we'll need this later.
Create service credentials by using the menu on the left and selecting the default options.
Export the username and password as environment variables and then load the data using the command below. This will take around 3 hours.
export USERNAME=<username_from_credentials>
export PASSWORD=<password_from_credentials>
export FILE=data/ICD-10-GT-AA.csv
curl -i --user "$USERNAME":"$PASSWORD" -F training_data=@$FILE -F training_metadata="{\"language\":\"en\",\"name\":\"ICD-10Classifier\"}" "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
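The hand-escaped quotes in the training_metadata field are easy to get wrong. As a hedged alternative, the same JSON string can be generated in Python. The helper name below is hypothetical; the endpoint and field values are copied from the command above.

```python
import json

# Endpoint copied from the command above.
CLASSIFIERS_URL = (
    "https://gateway.watsonplatform.net/"
    "natural-language-classifier/api/v1/classifiers"
)


def build_training_metadata(name, language="en"):
    """Return the training_metadata form field as a JSON string."""
    return json.dumps({"language": language, "name": name})


metadata = build_training_metadata("ICD-10Classifier")
print(metadata)
```

If the third-party requests library is installed, one plausible way to send the same request (not tested here) is requests.post(CLASSIFIERS_URL, auth=(username, password), files={"training_data": open(path, "rb"), "training_metadata": (None, metadata)}), where username, password and path are your own values.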
After running the command to create the classifier, note the classifier_id in the JSON that is returned:
{
  "classifier_id" : "ab2aa6x341-nlc-1176",
  "name" : "ICD-10Classifier",
  "language" : "en",
  "created" : "2018-04-18T14:09:28.403Z",
  "url" : "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/ab2aa6x341-nlc-1176",
  "status" : "Training",
  "status_description" : "The classifier instance is in its training phase, not yet ready to accept classify requests"
}
Export that value as an environment variable:
export CLASSIFIER_ID=<my_classifier_id>
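Rather than copying the ID by hand, the returned JSON can be parsed. This is a small sketch with a hypothetical helper name. It assumes only the JSON body is passed in; because the create command uses -i, the response headers that precede the body would need to be stripped first.

```python
import json


def parse_classifier_response(body):
    """Extract the classifier ID and training status from the JSON body."""
    data = json.loads(body)
    return data["classifier_id"], data.get("status", "unknown")


# Abridged copy of the response returned when the classifier is created.
response_body = """
{
  "classifier_id": "ab2aa6x341-nlc-1176",
  "name": "ICD-10Classifier",
  "language": "en",
  "status": "Training"
}
"""
classifier_id, status = parse_classifier_response(response_body)
print(f"export CLASSIFIER_ID={classifier_id}  # status: {status}")
```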
Now you can check the training status of your classifier:
curl --user "$USERNAME":"$PASSWORD" "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/$CLASSIFIER_ID"
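The status request returns a single snapshot, and training runs for hours, so a polling loop may be more convenient than re-running the command by hand. The sketch below is one possible shape, not code from the repository. fetch_status is a hypothetical callable that you would implement around the status request, and "Available" is assumed to be the status reported once training finishes.

```python
import time


def wait_for_classifier(fetch_status, interval_seconds=300, max_checks=60,
                        sleep=time.sleep):
    """Poll until the classifier leaves the "Training" state.

    fetch_status is any zero-argument callable returning the current status
    string, for example a wrapper around the status request.
    Returns the last status seen, which is still "Training" if max_checks
    is reached first.
    """
    status = fetch_status()
    checks = 1
    while status == "Training" and checks < max_checks:
        sleep(interval_seconds)
        status = fetch_status()
        checks += 1
    return status


# Stand-in for the real request so the sketch runs offline.
fake_statuses = iter(["Training", "Training", "Available"])
final = wait_for_classifier(lambda: next(fake_statuses), sleep=lambda s: None)
print(final)
```

Injecting the sleep function keeps the loop testable without waiting. In real use the default time.sleep applies.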
This application can be run locally or hosted on IBM Cloud. Follow the steps below for your chosen deployment.
Clone this project: git clone git@github.com:IBM/nlc-icd10-classifier.git
cd into this project's root directory
(Optionally) create and activate a virtual environment: virtualenv my-nlc-classifier followed by . my-nlc-classifier/bin/activate
Run pip install -r requirements.txt to install the app's dependencies
Copy the env.example file to .env
Update the .env file with your NLC credentials:
# Replace the credentials here with your own.
# Rename this file to .env before running run.py.
NATURAL_LANGUAGE_CLASSIFIER_USERNAME=<add_NLU_username>
NATURAL_LANGUAGE_CLASSIFIER_PASSWORD=<add_NLU_password>
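A quick check that no placeholder was left in .env can save a confusing authentication error later. The parser below is a minimal stand-in written for this example. It is not how the app itself loads the file, and both function names are hypothetical.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unfilled_keys(values):
    """Return keys whose values are empty or still look like <placeholder>."""
    return sorted(
        key for key, value in values.items()
        if not value or (value.startswith("<") and value.endswith(">"))
    )


# Example file contents with one credential still unset (values are made up).
example = """\
# Replace the credentials here with your own.
NATURAL_LANGUAGE_CLASSIFIER_USERNAME=<add_NLU_username>
NATURAL_LANGUAGE_CLASSIFIER_PASSWORD=s3cret
"""
print(unfilled_keys(parse_env(example)))
```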
Run python welcome.py
Access the running app in a browser at http://localhost:5000
Clone this project: git clone git@github.com:IBM/nlc-icd10-classifier.git
cd into this project's root directory
Update manifest.yml with the NLC service name (your_nlc_service_name), a unique application name (your_app_name) and a unique host value (your_app_host)
applications:
- path: .
  memory: 256M
  instances: 1
  domain: mybluemix.net
  name: your_app_name
  host: your_app_host
  disk_quota: 1024M
  services:
  - your_nlc_service_name
  buildpack: python_buildpack
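Before pushing, it may help to confirm that all three placeholders were replaced. This is a plain string check, sketched under the assumption that the placeholder names match those listed in the step above. The helper name and the sample manifest fragment are hypothetical.

```python
PLACEHOLDERS = ("your_nlc_service_name", "your_app_name", "your_app_host")


def remaining_placeholders(manifest_text):
    """Return the placeholder names still present in the manifest text."""
    return [name for name in PLACEHOLDERS if name in manifest_text]


# Hypothetical, partially edited manifest fragment.
manifest = """\
applications:
- path: .
  name: icd10-demo
  host: your_app_host
  services:
  - my-nlc-service
"""
print(remaining_placeholders(manifest))
```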
Run bluemix app push from the root directory
Access the running app by going to: https://<host-value>.mybluemix.net/
If you've never run the bluemix command before, some configuration is required. Refer to the official IBM Cloud CLI docs to get this set up.
The user enters text into the Text to classify: box, and the Watson NLC classifier returns ICD-10 classifications with confidence scores.
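To illustrate how classifications and confidence scores might be pulled out of a classify result, here is a hedged sketch. It assumes the response carries a "classes" list whose items have "class_name" and "confidence" fields. The function name is hypothetical, and the sample response is invented for illustration rather than produced by the app.

```python
def top_classes(response, limit=3):
    """Return (class_name, confidence) pairs, highest confidence first."""
    classes = sorted(
        response.get("classes", []),
        key=lambda c: c["confidence"],
        reverse=True,
    )
    return [(c["class_name"], c["confidence"]) for c in classes[:limit]]


# Hypothetical response; the codes and scores are invented for illustration.
sample_response = {
    "text": "example input text",
    "top_class": "A00.9",
    "classes": [
        {"class_name": "A00.9", "confidence": 0.91},
        {"class_name": "A00.0", "confidence": 0.05},
        {"class_name": "A00.1", "confidence": 0.03},
        {"class_name": "A01.0", "confidence": 0.01},
    ],
}
for name, confidence in top_classes(sample_response):
    print(f"{name}: {confidence:.0%}")
```

Sorting is defensive here: it keeps the helper correct even if the list arrives in a different order.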
Here is the output for the input Gastrointestinal hemorrhage: