Name: election-transcriber
Owner: datamade
Description: :pencil2: Election Transcription Interface built in collaboration with National Democratic Institute
Created: 2015-02-06 16:28:29.0
Updated: 2017-12-08 20:18:17.0
Pushed: 2018-02-23 20:07:26.0
Size: 5398
Language: Java
A tool for digitizing election results data in the form of handwritten digits.
The instructions below should get you set up with a development environment. To get going in production, follow the instructions in DEPLOYMENT.md.
Install OS level dependencies:
Clone this repo & install app requirements
We recommend using virtualenv and virtualenvwrapper for working in a virtualized development environment. Read how to set up virtualenv.
Once you have virtualenvwrapper set up,
mkvirtualenv et
git clone git@github.com:datamade/election-transcriber.git
cd election-transcriber
pip install -r requirements.txt
Create a PostgreSQL database for election transcriber
If you aren't already running PostgreSQL, we recommend installing version 9.6 or later.
createdb election_transcriber
Create your own app_config.py file
cp transcriber/app_config.py.example transcriber/app_config.py
You will need to change, at minimum:
DB_USER and DB_PW to reflect your PostgreSQL username/password (by default, the username is your computer name & the password is '')
S3_BUCKET to tell the application where to look for your cache of images to transcribe
AWS_CREDENTIALS_PATH to tell the application where to find the CSV file with your AWS credentials in it. By default, the application looks for a file called credentials.csv in the root folder of the project.
You can also change the username, email and password for the initial user roles, defined by ADMIN_USER, MANAGER_USER, and CLERK_USER.
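As a rough illustration, the settings named above might look like this once filled in. This is a sketch under stated assumptions, not the contents of app_config.py.example: every value is a placeholder, `DB_NAME` and `build_db_url` are hypothetical names introduced here, and the user-role settings are left out because their exact shape is not described in this README.

```python
from urllib.parse import quote

# Placeholder values for the settings named in this README; use your own.
DB_USER = "my_username"                   # assumption: your PostgreSQL username
DB_PW = ""                                # empty password, as in the default setup
DB_NAME = "election_transcriber"          # hypothetical setting; the database created earlier
S3_BUCKET = "my-election-images"          # assumption: placeholder bucket name
AWS_CREDENTIALS_PATH = "credentials.csv"  # default location: the project root


def build_db_url(user, password, name, host="localhost", port=5432):
    """Hypothetical helper: assemble a PostgreSQL connection URL.

    The user and password are percent-encoded so that characters such as
    '@' or ':' cannot be mistaken for URL delimiters.
    """
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    return f"postgresql://{auth}@{host}:{port}/{name}"


if __name__ == "__main__":
    print(build_db_url(DB_USER, DB_PW, DB_NAME))
```

Printing the URL is a quick way to confirm the username, password, and database name line up before you initialize the database.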
Create your own alembic.ini file
cp alembic.ini.example alembic.ini
You will need to change, at minimum, user & pass (to reflect your PostgreSQL username/password) on line 6.
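A small check can catch a forgotten edit to alembic.ini. The sketch below assumes that the line to change is the `sqlalchemy.url` setting in the `[alembic]` section, as in Alembic's stock template; the sample text and the `still_has_placeholders` helper are illustrative and are not part of this project.

```python
import configparser

# Assumption: a trimmed-down alembic.ini that still holds the placeholder
# credentials. The real example file may differ.
SAMPLE_INI = """\
[alembic]
script_location = alembic
sqlalchemy.url = postgresql://user:pass@localhost/election_transcriber
"""


def still_has_placeholders(ini_text):
    """Return True if sqlalchemy.url still contains the literal user:pass."""
    # Interpolation is disabled because alembic.ini files usually contain
    # logging format strings with '%' signs.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    url = parser.get("alembic", "sqlalchemy.url")
    return "://user:pass@" in url


if __name__ == "__main__":
    print(still_has_placeholders(SAMPLE_INI))
```

To check your own file, pass the text of alembic.ini instead of `SAMPLE_INI`.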
Initialize the database
alembic upgrade head
Import images
python update_images.py
Run the app
python runserver.py
In another terminal, run the worker
python run_queue.py
Once the server is running, navigate to http://localhost:5000/
There is a script in the root folder of the project called syncDriveFolder.py. As you might guess, it's the script that is responsible for syncing files from a Google Drive folder to an AWS S3 bucket.
Setup Google Service Account
{
  "type": "service_account",
  "project_id": "[name of the project]",
  "private_key_id": "[long hash]",
  "private_key": "[very very long hash]",
  "client_email": "some-user@project-name.iam.gserviceaccount.com",
  "client_id": "[long number]",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "[long URL]"
}
As was explained in the part where you download that, the contents of this file should be kept secret.
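Before running the sync, it can help to confirm that the downloaded file really is a service account key and to read the `client_email` from it without printing any secret fields. The following is a minimal sketch using only the standard library; `read_client_email` is a hypothetical helper, not something the project provides.

```python
import json

# Field names found in a Google service account key file.
EXPECTED_KEYS = {
    "type", "project_id", "private_key_id", "private_key", "client_email",
    "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
}


def read_client_email(json_text):
    """Parse a service account key and return its client_email.

    Raises ValueError if any expected field is missing or if the file is
    not a service account key.
    """
    creds = json.loads(json_text)
    missing = EXPECTED_KEYS - creds.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if creds["type"] != "service_account":
        raise ValueError("not a service account key")
    return creds["client_email"]
```

For example, `read_client_email(open("credentials.json").read())` would return the address, assuming the key was saved as credentials.json in the project root.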
client_email address from that JSON file.
Setup AWS User
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1508430268000",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::[bucket_name]/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::[bucket_name]"
            ]
        }
    ]
}
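To avoid hand-editing the `[bucket_name]` placeholder, the policy can be generated for a given bucket. This is a sketch that mirrors the two statements in the policy above; `build_bucket_policy` and the bucket name in the usage line are hypothetical, and the optional `Sid` field is left out.

```python
import json


def build_bucket_policy(bucket_name):
    """Return an IAM policy dict with the two statements for one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Full S3 access to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:*"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
            {
                # Listing applies to the bucket itself, not to its objects,
                # so the resource ARN has no trailing "/*".
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_bucket_policy("my-election-images"), indent=4))
```

The printed JSON can be pasted into the policy editor when creating the AWS user.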
To run the syncDriveFolder.py script, put the credentials file from Google and the credentials file from AWS in the root folder of the project, then run the script like this:
python syncDriveFolder.py -f [name_of_drive_folder] -n [name_of_election]
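A quick preflight check can confirm that both credentials files are where the script looks for them by default. The sketch below assumes the default file names credentials.csv and credentials.json in the project root; `missing_credentials` is a hypothetical helper.

```python
from pathlib import Path


def missing_credentials(project_root, names=("credentials.csv", "credentials.json")):
    """Return the credential file names that are absent from project_root."""
    root = Path(project_root)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_credentials(".")
    if missing:
        print("Missing credentials files:", ", ".join(missing))
    else:
        print("Both credentials files found.")
```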
A full list of options for that script can be seen by running python syncDriveFolder.py --help.
usage: syncDriveFolder.py [-h] [--aws-creds AWS_CREDS]
                          [--google-creds GOOGLE_CREDS] -n ELECTION_NAME -f
                          DRIVE_FOLDER [--capture-hierarchy]

Sync and convert images from a Google Drive Folder to an S3 Bucket

optional arguments:
  -h, --help            show this help message and exit
  --aws-creds AWS_CREDS
                        Path to AWS credentials. (default:
                        /home/eric/code/election-transcriber/credentials.csv)
  --google-creds GOOGLE_CREDS
                        Path to Google credentials. (default:
                        /home/eric/code/election-transcriber/credentials.json)
  -n ELECTION_NAME, --election-name ELECTION_NAME
                        Short name to be used under the hood for the election
                        (default: None)
  -f DRIVE_FOLDER, --drive-folder DRIVE_FOLDER
                        Name of the Google Drive folder to sync (default:
                        None)
  --capture-hierarchy   Capture a geographical hierarchy from the name of the
                        file. (default: False)
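For readers who want to see how such an interface fits together, here is an argparse sketch reconstructed from the help text alone. It is not the script's own code: `build_parser` and its `project_root` parameter are hypothetical, the description string is a best reading of the help output, and the use of `ArgumentDefaultsHelpFormatter` is inferred from the "(default: ...)" suffixes.

```python
import argparse
import os


def build_parser(project_root="."):
    """Build a parser whose options mirror the help output for the script."""
    parser = argparse.ArgumentParser(
        prog="syncDriveFolder.py",
        description="Sync and convert images from a Google Drive Folder to an S3 Bucket",
        # Appends "(default: ...)" to each option's help text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "--aws-creds",
        default=os.path.join(project_root, "credentials.csv"),
        help="Path to AWS credentials.",
    )
    parser.add_argument(
        "--google-creds",
        default=os.path.join(project_root, "credentials.json"),
        help="Path to Google credentials.",
    )
    # -n and -f are required, so their default is shown as None.
    parser.add_argument(
        "-n", "--election-name", required=True,
        help="Short name to be used under the hood for the election",
    )
    parser.add_argument(
        "-f", "--drive-folder", required=True,
        help="Name of the Google Drive folder to sync",
    )
    parser.add_argument(
        "--capture-hierarchy", action="store_true",
        help="Capture a geographical hierarchy from the name of the file.",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Because both credential paths default to files in the project root, only `-n` and `-f` need to be supplied in the common case.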