datamade/elpc_bakken

Name: elpc_bakken

Owner: datamade

Description: Bakken well files PDF extraction

Created: 2015-06-24 15:50:48.0

Updated: 2017-01-05 16:59:42.0

Pushed: 2015-08-11 17:27:37.0

Homepage: null

Size: 1788913

Language: HTML

README

ELPC Bakken Well File PDF Extraction

This Makefile extracts text from PDFs, OCRs images embedded in the PDFs, and extracts data from the results.

Requirements
Run apt-get install tesseract-ocr ocrfeeder poppler-utils (root privileges are typically needed).
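
Before running the Makefile, it can help to confirm that the tools are on the PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and the command names checked (`tesseract`, `pdftotext`, `ocrfeeder-cli`) are assumed to be the binaries those three packages install.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed binary names for tesseract-ocr, poppler-utils, and ocrfeeder.
missing="$(missing_tools tesseract pdftotext ocrfeeder-cli)"
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```

If anything is reported missing, reinstalling the corresponding package is the likely fix.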
To run

To parallelize the task, use make's -j option: make -j 8 will use 8 processes.
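
Rather than hard-coding 8, the job count can be matched to the machine. This is a sketch under the assumption that `nproc` (GNU coreutils) is available; it falls back to a single job otherwise, and it only prints the command so that nothing runs by accident.

```shell
# Use one make job per available CPU core; fall back to 1 if nproc is unavailable.
jobs="$(nproc 2>/dev/null || echo 1)"

# Print the command to run from the repository checkout.
echo "make -j ${jobs}"
```

OCR is CPU-bound, so one job per core is a reasonable starting point; a lower value may be preferable if memory is limited.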


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.