datamade/metro-pdf-merger

Name: metro-pdf-merger

Owner: datamade

Description: null

Created: 2017-03-15 19:29:23.0

Updated: 2018-03-22 21:58:33.0

Pushed: 2018-03-22 21:58:32.0

Homepage: null

Size: 169

Language: Python

README

Metro PDF Merger

A Flask app that listens for requests from LA Metro Councilmatic. The app consolidates the PDFs attached to Board Reports and Events, stores the merged documents, and provides a route that returns them.

Set up

Copy the config.py.example file. It has everything you need to run the app. If you want to configure the app with Sentry, then find your DSN Client key and assign its value to SENTRY_DSN.

cp config.py.example config.py

Create a virtualenv:

mkvirtualenv metro-merger

Install dependencies:

pip install -r requirements.txt

The Metro PDF Merger uses unoconv, a CLI tool that performs document conversions; it reads any document type supported by LibreOffice.
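To illustrate how a Python app can hand a document to unoconv, here is a minimal sketch that shells out to the CLI with `subprocess`. The function names and the timeout value are assumptions made for illustration; this is not necessarily how metro-pdf-merger invokes the tool.

```python
import shutil
import subprocess
from pathlib import Path


def build_unoconv_command(source, output_dir):
    """Return the argument list that converts `source` to PDF with unoconv."""
    source = Path(source)
    target = Path(output_dir) / (source.stem + ".pdf")
    # -f selects the output format; -o sets the output file path.
    return ["unoconv", "-f", "pdf", "-o", str(target), str(source)]


def convert_to_pdf(source, output_dir, timeout=120):
    """Convert a document to PDF, raising if unoconv is missing or fails."""
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv is not on PATH")
    command = build_unoconv_command(source, output_dir)
    # check=True turns a non-zero exit status into CalledProcessError.
    subprocess.run(command, check=True, timeout=timeout)
    return Path(command[4])
```

Keeping the command construction separate from the call makes it possible to inspect the arguments without LibreOffice installed.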

Mac OS

Install unoconv with brew:

brew install unoconv

The brew installation comes with a caveat: unoconv works only with LibreOffice versions 3.6.0.1 through 4.3.x, so install LibreOffice from the DMG file for version 4.3.

Ubuntu

On Linux, as on any operating system, you may choose to install only part of LibreOffice. A partial install keeps your server safer from attacks (a smaller surface area for potential invasion) and free of the heavyweight packaging in the full LibreOffice suite.

Install libreoffice-script-provider-python and the necessary packages from LibreOffice:

apt-get install libreoffice-script-provider-python
apt-get install libreoffice-writer
apt-get install libreoffice-calc
apt-get install libreoffice-impress

Then, install unoconv from source:

# Do the following as the datamade user:
mkdir unoconv
cd unoconv
wget https://raw.githubusercontent.com/dagwieers/unoconv/master/unoconv
# Assign read, write, and execute permissions to unoconv source file
chmod 755 unoconv
# Make a symbolic link
sudo ln -s /home/datamade/unoconv/unoconv /usr/bin/unoconv

In the unoconv file, specify the location of Python:

#!/usr/bin/python3

Get started

Run the app locally:

python app.py

This app uses Redis, a data store that brokers messages between a sender and a receiver. Download and install Redis first. Then put it to work: in a new terminal tab, run:

python run_worker.py

This module calls queue_daemon, a while loop that processes entries in the Redis queue. For each entry, it runs the makePacket function, which merges the source documents and saves the newly consolidated PDF.
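The worker pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the job fields (`merged_id`, `urls`), the `make_packet_stub` function, and the `pop_job` callable are hypothetical, and a plain list stands in for Redis so the sketch runs without a server.

```python
import json
import time


def make_packet_stub(merged_id, urls):
    """Hypothetical stand-in for makePacket; it only describes the merge."""
    return "{}.pdf ({} source documents)".format(merged_id, len(urls))


def queue_daemon_sketch(pop_job, handler, idle_sleep=1.0, max_idle_polls=None):
    """Poll a queue and run `handler` on each job.

    pop_job returns a JSON string (or bytes), or None when the queue is empty.
    max_idle_polls bounds consecutive empty polls; None means loop forever.
    With redis-py, pop_job could be `lambda: client.lpop("queue")`, where the
    key name "queue" is an assumption.
    """
    results = []
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        raw = pop_job()
        if raw is None:
            idle_polls += 1
            time.sleep(idle_sleep)
            continue
        idle_polls = 0
        job = json.loads(raw)
        try:
            results.append(handler(job["merged_id"], job["urls"]))
        except Exception as error:  # keep the worker alive on a bad job
            results.append("failed: {}".format(error))
    return results


if __name__ == "__main__":
    pending = [json.dumps({"merged_id": "board-report-1", "urls": ["a.pdf", "b.pdf"]})]
    pop = lambda: pending.pop(0) if pending else None
    print(queue_daemon_sketch(pop, make_packet_stub, idle_sleep=0, max_idle_polls=1))
```

Catching exceptions per job is a design choice in this sketch: one malformed entry should not stop the loop from draining the rest of the queue.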

This app serves the needs of LA Metro Councilmatic. Learn about setting up an instance of LA Metro Councilmatic. LA Metro comes with a management command that queries the Metro database and sends POST requests to this app. Each request carries a JSON object, which contains URLs that point to bill documents on Legistar (i.e., the documents that metro-pdf-merger consolidates).
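As a rough picture of what such a request might look like, the sketch below builds (without sending) a POST request whose JSON body lists document URLs. The route `/merge_pdfs/<slug>`, the payload key `urls`, and the function name are all hypothetical; check the LA Metro management command for the real request shape.

```python
import json
import urllib.request


def build_merge_request(base_url, slug, document_urls):
    """Build (but do not send) a POST request carrying document URLs as JSON."""
    # The route and the payload key below are assumptions for illustration.
    endpoint = "{}/merge_pdfs/{}".format(base_url.rstrip("/"), slug)
    body = json.dumps({"urls": list(document_urls)}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it to a running instance of the app.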

In the LA Metro repo, find settings.py and change MERGER_BASE_URL. It should point to your Flask app, for instance:

MERGER_BASE_URL = 'http://0.0.0.0:5000'

Then, run the management command in your LA Metro repo:

# Grab all documents
python manage.py compile_pdfs --all_documents

# Grab only the most recently added documents
python manage.py compile_pdfs

AWS Buckets

We store the merged PDF packets in an AWS S3 bucket. You may want to test this tool locally, but still send PDFs to AWS. To do so, you need the right credentials, and you need to tell your app to send PDFs to our test S3 bucket.

mkdir ~/.aws
touch ~/.aws/credentials
touch ~/.aws/config

# ~/.aws/credentials
[default]
aws_access_key_id = ****
aws_secret_access_key = ****

# ~/.aws/config
[default]
region = us-east-1

Credentials set!
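If you want to confirm that both files contain what the AWS tooling expects, a small check like the following can help. It is a sketch using only the standard library; the function name is hypothetical, and it checks only that the keys exist, not that the credentials are valid.

```python
import configparser
from pathlib import Path


def missing_aws_settings(credentials_path, config_path, profile="default"):
    """Return the settings the two AWS files still lack for `profile`."""
    required = [
        (Path(credentials_path), "aws_access_key_id"),
        (Path(credentials_path), "aws_secret_access_key"),
        (Path(config_path), "region"),
    ]
    missing = []
    for path, key in required:
        # interpolation=None so a literal "%" in a value is never interpreted.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(path)  # yields no sections if the file is absent
        if not parser.has_option(profile, key):
            missing.append("{}: {}".format(path.name, key))
    return missing


if __name__ == "__main__":
    home = Path.home()
    print(missing_aws_settings(home / ".aws" / "credentials", home / ".aws" / "config"))
```

An empty list means all three settings are present for the profile.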

Finally, tell the app where to save merged PDFs. Add the following to config.py:

S3_BUCKET = 'datamade-metro-pdf-merger-testing'

Head over to the AWS console, and watch Metro PDF packets appear!
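For a sense of what sending a packet to the bucket involves, here is a minimal sketch built around the boto3 S3 client's `upload_file` method. The `upload_packet` helper and the example file name are assumptions; the app's own upload code may differ. The client is passed in as an argument so the function can be exercised without AWS access.

```python
import os


def upload_packet(s3_client, bucket, local_path, key=None):
    """Upload a merged PDF to `bucket` and return the object key used."""
    key = key or os.path.basename(local_path)
    # upload_file is a boto3 S3 client method; setting ContentType lets
    # browsers render the object as a PDF.
    s3_client.upload_file(
        local_path,
        bucket,
        key,
        ExtraArgs={"ContentType": "application/pdf"},
    )
    return key


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   upload_packet(boto3.client("s3"), "datamade-metro-pdf-merger-testing",
#                 "board-report-1.pdf")
```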

Team

Errors / Bugs

If something does not behave intuitively, it is a bug and should be reported. Report it here: https://github.com/datamade/nyc-councilmatic/issues

Note on Patches/Pull Requests

Copyright

Copyright (c) 2017 DataMade. Released under the MIT License.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.