Name: pdf-to-markdown
Owner: g0v
Description: Convert PDF files into markdown files
Created: 2015-05-23 16:43:23.0
Updated: 2015-05-23 16:43:24.0
Pushed: 2015-05-24 14:20:38.0
Homepage: null
Size: 6558
Language: Python
This is NOT a general-purpose converter. It currently handles only urban planning documents from Taiwan.
From this PDF file, we generate:
You should install pdfminer first.
sudo apt-get install python-pdfminer
git clone git@github.com:euske/pdfminer.git
cd pdfminer
make cmap
sudo python setup.py install
The make cmap step is necessary for documents containing Chinese characters.
To convert a PDF, run:
python main.py <pdf>
For example, you can use our example PDF file:
python main.py examples/neihu.pdf
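To convert several files in one go, the same command can be wrapped in a small helper script. The following is only a sketch and is not part of this repository: the names build_command and convert_all are hypothetical, and it assumes that main.py sits in the current directory, that the python on your PATH is the interpreter pdfminer was installed for, and that main.py exits with a non-zero status when a conversion fails. The helper itself needs Python 3.5 or newer for subprocess.run.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, script="main.py"):
    """Return the argument list equivalent to `python main.py <pdf>`.

    `script` defaults to main.py in the current directory (an assumption).
    """
    return ["python", script, str(pdf_path)]


def convert_all(pdf_dir, script="main.py"):
    """Run the converter once per *.pdf file in pdf_dir.

    Returns a dict mapping each file name to the exit code of its run.
    Treating a non-zero exit code as a failure is an assumption about
    how main.py behaves, not something the README states.
    """
    results = {}
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        completed = subprocess.run(build_command(pdf_path, script))
        results[pdf_path.name] = completed.returncode
    return results
```

Calling convert_all("examples") from the repository root would then run the converter on every PDF in the examples directory, including examples/neihu.pdf, and report which runs returned a non-zero exit code.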