dssg/obscuritext

Name: obscuritext

Owner: Data Science for Social Good

Description: Transform text to be unreadable but still somewhat useful

Created: 2018-01-08 18:57:55.0

Updated: 2018-04-23 00:39:09.0

Pushed: 2018-02-19 13:28:03.0

Homepage: null

Size: 86

Language: Jupyter Notebook

README

obscuritext

A Python script that transforms free text into de-identified data by hashing each word. The hashes remain usable for statistical modeling, while the text itself becomes effectively unreadable.
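As a rough illustration of the idea, word-level hashing might look like the sketch below. This is not the code from `text_obscure.py`: the choice of SHA-256, the optional salt, the lowercasing, the whitespace tokenization, and the 12-character truncation are all assumptions made for the example, and the function names `hash_word` and `obscure_text` are hypothetical.

```python
import hashlib


def hash_word(word: str, salt: str = "") -> str:
    """Return a short, deterministic token for one word.

    Assumption: SHA-256 with an optional salt, truncated to 12 hex
    characters. The real script may use a different scheme.
    """
    digest = hashlib.sha256((salt + word).encode("utf-8")).hexdigest()
    return digest[:12]


def obscure_text(text: str, salt: str = "") -> str:
    """Replace every whitespace-separated word with its hashed token."""
    # Lowercasing is an assumption so that "The" and "the" share a token.
    return " ".join(hash_word(w, salt) for w in text.lower().split())


if __name__ == "__main__":
    print(obscure_text("The cat saw the dog", salt="example-salt"))
```

Because the same word always maps to the same token, word frequencies and co-occurrences survive the transformation, which is what keeps the output usable for modeling. Note that unsalted hashes of common words can be recovered by hashing a dictionary, so a secret salt is a reasonable precaution in a sketch like this.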

Usage

To run the obscuring process:

Requirements

Store the text to be obscured in a named column of a CSV file. Place the file in the same directory as the script, and give it a name that contains no spaces. Specify this file name in the configuration file.
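To make the input requirement concrete, the sketch below reads a CSV, hashes the words in one named column, and writes the result back out with the other columns untouched. It is an illustration only: the function name `obscure_csv`, the column name `notes`, the salt, and the SHA-256 truncation are assumptions, and the script's actual configuration keys are not shown here.

```python
import csv
import hashlib
import io


def obscure_csv(src, dst, column: str, salt: str = "") -> None:
    """Copy CSV rows from src to dst, hashing each word in `column`.

    `src` and `dst` are open text file objects. The hashing scheme
    (salted SHA-256, first 12 hex characters) is an assumption.
    """
    reader = csv.DictReader(src)
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found in CSV header")

    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        words = (row[column] or "").split()
        row[column] = " ".join(
            hashlib.sha256((salt + w).encode("utf-8")).hexdigest()[:12]
            for w in words
        )
        writer.writerow(row)


if __name__ == "__main__":
    # In-memory stand-in for a CSV file stored next to the script.
    source = io.StringIO("id,notes\n1,patient is well\n2,patient is ill\n")
    target = io.StringIO()
    obscure_csv(source, target, column="notes", salt="example-salt")
    print(target.getvalue())
```

With real files, the same function could be called with `open(path, newline="")` handles for input and output, as the `csv` module documentation recommends.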

To run, use: python3 text_obscure.py

The script uses the following packages for Python 3:

Configurations

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.