Name: liveblog
Owner: NPR visuals team
Description: NPR Liveblog Rig based on Google Docs and Google Apps Scripts
Created: 2017-01-03 11:51:51.0
Updated: 2017-09-07 16:26:10.0
Pushed: 2018-05-14 20:11:49.0
Homepage: null
Size: 584
Language: JavaScript
NPR Liveblog app based on Google Docs and Google Apps Script
The following things are assumed to be true in this documentation.
For more details on the technology stack used with the app-template, see our development environment blog post.
The project contains the following folders and important files:
confs – Server configuration files for nginx and uwsgi. Edit the templates, then run fab <ENV> servers.render_confs; don't edit anything in confs/rendered directly.
data – Data files, such as those used to generate HTML.
fabfile – Fabric commands for automating setup, deployment, data processing, etc.
etc – Miscellaneous scripts and metadata for project bootstrapping.
jst – JavaScript (Underscore.js) templates.
less – LESS files, will be compiled to CSS and concatenated for deployment.
templates – HTML (Jinja2) templates, to be compiled locally.
tests – Python unit tests.
www – Static and compiled assets to be deployed. (a.k.a. “the output”)
www/assets – A symlink to an S3 bucket containing binary assets (images, audio).
www/live-data – “Live” data deployed to S3 via cron jobs or other mechanisms. (Not deployed with the rest of the project.)
www/test – JavaScript tests and supporting files.
app.py – A Flask app for rendering the project locally.
app_config.py – Global project configuration for scripts, deployment, etc.
crontab – Cron jobs to be installed as part of the project.
package.json – Contains both server-side and client-side JavaScript dependencies and scripts for webpack.
parse_doc.py – Contains the Google Doc HTML parser functionality.
public_app.py – A Flask app for running server-side code.
render_utils.py – Code supporting template rendering.
requirements.txt – Python requirements.
static.py – Static Flask views used in both app.py and public_app.py.
webpack.config.js – Webpack configuration for server-side/local.
webpack.production.config.js – Webpack configuration for generating JS to go to staging/production.
Node.js is required for the static asset pipeline. If you don't already have it, get it like this:
brew install node
curl https://npmjs.org/install.sh | sh
MongoDB is used to cache the ratios of our visual assets so that we do not need to download them every time the parser runs. If you do not have MongoDB installed, run:
brew install mongodb
Then bootstrap the project:
cd liveblog
mkvirtualenv liveblog
pip install -r requirements.txt
npm install
fab update
Problems installing requirements? You may need to run the pip command as ARCHFLAGS=-Wno-error=unused-command-line-argument-hard-error-in-future pip install -r requirements.txt to work around an issue with OS X.
Project secrets should never be stored in app_config.py or anywhere else in the repository. They will be leaked to the client if you do. Instead, always store passwords, keys, etc. in environment variables and document that they are needed here in the README.
Any environment variable that starts with liveblog will be automatically loaded when app_config.get_secrets() is called.
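The prefix-based loading described above can be sketched roughly as follows. This is an illustration of the idea only, not the project's actual get_secrets() code: the function name load_prefixed_secrets and the sample variable names are assumptions made for the example.

```python
import os


def load_prefixed_secrets(prefix, environ=None):
    """Collect environment variables whose names start with `prefix`.

    Hypothetical helper for illustration; the real app_config.get_secrets()
    may differ (for example, it may strip the prefix from the names).
    """
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


# Example with a made-up environment: only the prefixed variable is kept.
sample_env = {"liveblog_API_KEY": "abc123", "HOME": "/home/ubuntu"}
secrets = load_prefixed_secrets("liveblog", sample_env)
```

Passing the environment as an argument keeps the sketch easy to try without touching your real shell variables.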
Large media assets (images, videos, audio) are synced with an Amazon S3 bucket specified in app_config.ASSETS_S3_BUCKET in a folder with the name of the project. (This bucket should not be the same as any of your app_config.PRODUCTION_S3_BUCKETS or app_config.STAGING_S3_BUCKETS.) This allows everyone who works on the project to access these assets without storing them in the repo, giving us faster clone times and the ability to open source our work.
Syncing these assets requires running a couple of different commands at the right times. When you create new assets or make changes to current assets that need to get uploaded to the server, run `fab assets.sync`. This will do a few things:
Unfortunately, there is no automatic way to know when a file has been intentionally deleted from the server or your local directory. When you want to simultaneously remove a file from the server and your local environment (i.e. it is not needed in the project any longer), run `fab assets.rm:"www/assets/file_name_here.jpg"`.
A site can have any number of rendered pages, each with a corresponding template and view. To create a new one:
Add a template to the templates directory. Ensure it extends _base.html.
Add a corresponding view to app.py. Decorate it with a route to the page name, e.g. @app.route('/filename.html').
Views whose routes end with .html and do not start with _ will automatically be rendered when you call fab render.
In order to run the project locally you'll need three components running:
To run mongo locally in a terminal window run:
mongod --config /usr/local/etc/mongod.conf
If you want to live update a Google Doc locally, you will also need to run the daemon locally. You can do that with:
fab daemons.main
Finally, a flask app is used to run the project locally. It will automatically recompile templates and assets on demand.
workon liveblog
fab app
Visit localhost:7777 in your browser.
Do you use iTerm2 as your terminal app? Here's a sample AppleScript to automatically launch a four-paned terminal window (one to access the repo, one for the local webserver, one for the daemon and one for mongo).
You can save this locally, customize it to match your own configuration and add an alias for it to your .bash_profile.
alias liveblog="osascript ~/PATH-TO-FILE/Liveblog.scpt"
This project involved a lot of collaboration. For long periods we were all working simultaneously on different parts of the pipeline, and each of us needed the rest of the pipeline to stay stable in order to make progress.
This was particularly true for the Google Doc that we used as the source of the transcript: some of us were testing for quirks on the parsing side while others wanted to test navigation between annotations.
So we provided a way to override the app configuration locally. To do so, create a file called local_settings.py in your project root.
The properties that you can override are:
TRANSCRIPT_GDOC_KEY: The Google Doc key used as the input to our parsing process.
GAS_LOG_KEY: The Google Spreadsheet that stores the logs from the Google Apps Script execution.
S3_BASE_URL: Useful if you want to override the default port of the local server.
There are other properties that you can set, but they are better explained in the next section.
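As a rough illustration, a local_settings.py that overrides these three properties might look like the sketch below. Every value is a placeholder, and the exact format expected for S3_BASE_URL is an assumption; check app_config.py for the real defaults before copying this.

```python
# local_settings.py (sketch; every value below is a placeholder)

# Key of the Google Doc to parse instead of the shared transcript.
TRANSCRIPT_GDOC_KEY = "YOUR-TEST-GOOGLE-DOC-KEY"

# Key of the spreadsheet that receives the Google Apps Script logs.
GAS_LOG_KEY = "YOUR-LOG-SPREADSHEET-KEY"

# Base URL of the local server, here with a non-default port (assumed format).
S3_BASE_URL = "http://127.0.0.1:8001"
```

Because the file only exists on your machine, teammates keep the shared defaults while you test against your own document.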
Sometimes it is not a live event that you want to fact check but a straight-from-the-oven text that has just been released. This is a more static approach, but there's still a lot of value in the repo that can be used in a non-live situation, like the parsing and all the client code that generates the final embed with tracking of individual annotations.
In this case we would not use the Google Apps Script side of this repo, since we do not need to pull a transcript periodically from an API. We may also want to run the parsing locally and just send the results to S3 to create a static version of the application.
By default, this repo is configured for a live event, but by using local_settings.py to override the configuration we can turn it into a more static approach. Here are the properties that you can change:
DEPLOY_TO_SERVERS: Set it to False if you plan on deploying a static app.
DEPLOY_STATIC_FACTCHECK: Set it to True so that the fabric deploy command will also parse the last transcript and add it to the deploy process to S3.
CURRENT_LIVEBLOG: Bucket where you want to deploy the application.
This repo expects a Google Doc with a certain format in order to be able to parse it. To create such a doc, we use a Google Apps Script add-on that allows you to insert posts. Read more about it here.
We access the Live Fact Check document from the server to pull out its content, using credentials associated with nprappstumblr@gmail.com. Make sure that nprappstumblr@gmail.com has at least read access to the document in order to avoid a 403 response to the server.
This app uses a Google Spreadsheet for a simple key/value store that provides an editing workflow.
To access the Google doc, you'll need to create a Google API project via the Google developer console.
Enable the Drive API for your project and create a “web application” client ID.
For the redirect URIs use:
http://localhost:8000/authenticate/
http://127.0.0.1:8000/authenticate
http://localhost:8888/authenticate/
http://127.0.0.1:8888/authenticate
For the Javascript origins use:
http://localhost:8000
http://127.0.0.1:8000
http://localhost:8888
http://127.0.0.1:8888
You'll also need to set some environment variables:
export GOOGLE_OAUTH_CLIENT_ID="something-something.apps.googleusercontent.com"
export GOOGLE_OAUTH_CONSUMER_SECRET="bIgLonGStringOfCharacT3rs"
export AUTHOMATIC_SALT="jAmOnYourKeyBoaRd"
Note that AUTHOMATIC_SALT can be set to any random string. It's just cryptographic salt for the authentication library we use.
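Before starting the app, it can help to confirm that the three variables are actually set in your shell. The check below is a small sketch, not part of the project; the helper name missing_oauth_vars is made up for this example.

```python
import os

# The three variables named in this README.
REQUIRED_VARS = (
    "GOOGLE_OAUTH_CLIENT_ID",
    "GOOGLE_OAUTH_CONSUMER_SECRET",
    "AUTHOMATIC_SALT",
)


def missing_oauth_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Example with a made-up environment that only defines the client ID.
missing = missing_oauth_vars({"GOOGLE_OAUTH_CLIENT_ID": "example-id"})
```

Calling missing_oauth_vars() with no argument checks your real environment; an empty list means all three are present.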
Once set up, run fab app and visit http://localhost:8000 in your browser. If authentication is not configured, you'll be asked to allow the application read-only access to Google Drive, the account profile, and offline access on behalf of one of your Google accounts. This should be a one-time operation across all app-template projects.
It is possible to grant access to other accounts on a per-project basis by changing GOOGLE_OAUTH_CREDENTIALS_PATH in app_config.py.
View the sample copy spreadsheet.
This document is specified in app_config with the variable COPY_GOOGLE_DOC_KEY. To use your own spreadsheet, change this value to reflect your document's key (the long string of random-looking characters in your Google Docs URL, for example: 1DiE0j6vcCm55Dyj_sV5OJYoNXRRhn_Pjsndba7dVljo).
A few things to note:
If there is a column called key, there is expected to be a column called value, and rows will be accessed in templates as key/value pairs.
The app template is outfitted with a few fab utility functions that make pulling changes and updating your local data easy.
To update the latest document, simply run:
fab text.update
Note: text.update runs automatically whenever fab render is called.
At the template level, Jinja maintains a COPY object that you can use to access your values in the templates. Using our example sheet, to use the byline key in templates/index.html:
{{ COPY.attribution.byline }}
More generally, you can access anything defined in your Google Doc like so:
{{ COPY.sheet_name.key_name }}
You may also access rows using iterators. In this case, the column headers of the spreadsheet become keys and the row cells values. For example:
{% for row in COPY.sheet_name %}
    {{ row.column_one_header }}
    {{ row.column_two_header }}
{% endfor %}
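The two access patterns above (key/value lookup and row iteration) can be illustrated in plain Python. This is only a sketch of the idea, not how the copytext library is implemented: the Sheet class and the sample rows are hypothetical stand-ins.

```python
class Sheet(object):
    """Hypothetical stand-in for one worksheet of the COPY spreadsheet."""

    def __init__(self, rows):
        # Each row is a dict keyed by the sheet's column headers.
        self.rows = rows

    def __iter__(self):
        # Row iteration: yields one dict per spreadsheet row.
        return iter(self.rows)

    def __getitem__(self, key):
        # Key/value lookup: only meaningful when the sheet has
        # "key" and "value" columns.
        for row in self.rows:
            if row.get("key") == key:
                return row["value"]
        raise KeyError(key)


# Made-up sample data shaped like a sheet with "key" and "value" columns.
attribution = Sheet([
    {"key": "byline", "value": "Jane Doe"},
    {"key": "title", "value": "Example title"},
])

byline = attribution["byline"]
keys_in_order = [row["key"] for row in attribution]
```

The same sheet supports both styles, which mirrors how a template can either read a single key or loop over every row.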
When naming keys in the COPY document, please attempt to group them by common prefixes and order them by appearance on the page. For instance:
title
byline
about_header
about_body
about_url
download_label
download_url
Want to edit or view the app's linked Google Spreadsheet? We've got you covered.
We have created a simple Fabric task, `spreadsheet`. It will try to find and open the app's linked Google Spreadsheet in your default browser.
fab spreadsheet
If you are working with other arbitrary Google Docs that are not involved with the COPY rig, you can pass a key as a parameter to have that spreadsheet opened in your browser instead:
fab spreadsheet:$GOOGLE_DOC_KEY
For example:
fab spreadsheet:12_F0yhsXEPN1w3GOlQB4_NKGadXiRLOa9l-HQu5jSL8
This will open the 270 project number-crunching spreadsheet.
This project uses a custom font build powered by Fontello.
If the font does not exist, it will be created when running fab update.
To force generation of the custom font, run:
fab utils.install_font:true
Editing the font is a little tricky – you have to use the Fontello web GUI. To open the GUI with your font configuration, run:
fab utils.open_font
Now edit the font, download the font pack, copy the new config.json into this project's fontello directory, and run fab utils.install_font:true again.
Sometimes, our projects need to read data from a Google Doc that's not involved with the COPY rig. In this case, we've got a helper function for you to download an arbitrary Google spreadsheet.
This solution will download the uncached version of the document, unlike those methods which use the “publish to the Web” functionality baked into Google Docs. Published versions can take up to 15 minutes to update!
Make sure you're authenticated, then call oauth.get_document(key, file_path).
Here's an example of what you might do:
from copytext import Copy
from oauth import get_document

def read_my_google_doc():
    # Download the spreadsheet to a local .xlsx file, then parse it with copytext.
    file_path = 'data/extra_data.xlsx'
    get_document('1z7TVK16JyhZRzk5ep-Uq5SH4lPTWmjCecvJ5vCp6lS0', file_path)
    data = Copy(file_path)
    # Print every row of the "example_list" sheet.
    for row in data['example_list']:
        print('%s: %s' % (row['term'], row['definition']))

read_my_google_doc()
Python unit tests are stored in the tests directory. Run them with fab tests.
With the project running, visit localhost:8000/test/SpecRunner.html.
fab staging master deploy
You can deploy to EC2 for a variety of reasons. We cover two cases: running a dynamic web application (public_app.py) and executing cron jobs (crontab).
Servers capable of running the app can be set up using our servers project.
For running a Web application:
In app_config.py set DEPLOY_TO_SERVERS to True.
Also in app_config.py set DEPLOY_WEB_SERVICES to True.
Run fab staging master servers.setup to configure the server.
Run fab staging master deploy to deploy the app.
For running cron jobs:
In app_config.py set DEPLOY_TO_SERVERS to True.
Also in app_config.py, set INSTALL_CRONTAB to True.
Run fab staging master servers.setup to configure the server.
Run fab staging master deploy to deploy the app.
You can configure your EC2 instance to both run Web services and execute cron jobs; just set both DEPLOY_WEB_SERVICES and INSTALL_CRONTAB to True in app_config.py.
Cron jobs are defined in the file crontab. Each task should use the cron.sh shim to ensure the project's virtualenv is properly activated prior to execution. For example:
* * * * * ubuntu bash /home/ubuntu/apps/debates2/repository/cron.sh fab $DEPLOYMENT_TARGET cron_jobs.test
To install your crontab set INSTALL_CRONTAB to True in app_config.py. Cron jobs will be automatically installed each time you deploy to EC2.
The cron jobs themselves should be defined in fabfile/cron_jobs.py whenever possible.
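To make the shape of a crontab entry explicit (five schedule fields, the user, then the cron.sh shim wrapping a Fabric task), here is a small sketch that assembles one. The helper crontab_line is hypothetical and not part of the project; the path layout simply mirrors the example entry above.

```python
def crontab_line(schedule, task, project, user="ubuntu",
                 target="$DEPLOYMENT_TARGET"):
    """Build one crontab entry that runs a Fabric task through cron.sh.

    Hypothetical helper for illustration only.
    """
    if len(schedule.split()) != 5:
        raise ValueError("schedule must have five cron fields")
    # The shim activates the project's virtualenv before running fab.
    shim = "/home/{0}/apps/{1}/repository/cron.sh".format(user, project)
    return "{0} {1} bash {2} fab {3} {4}".format(
        schedule, user, shim, target, task)


# Run cron_jobs.test every minute for a project deployed as "debates2".
line = crontab_line("* * * * *", "cron_jobs.test", project="debates2")
```

Keeping the task logic in fabfile/cron_jobs.py means the crontab file itself stays a list of one-line entries like this.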
Web services are configured in the confs/ folder.
Running fab servers.setup will deploy your confs if you have set DEPLOY_TO_SERVERS and DEPLOY_WEB_SERVICES both to True at the top of app_config.py.
To check that these files are being properly rendered, you can render them locally and see the results in the confs/rendered/ directory.
fab servers.render_confs
You can also deploy only configuration files by running (normally this is invoked by deploy):
fab servers.deploy_confs
Sometimes it makes sense to run a fabric command on the server, for instance, when you need to render using a production database. You can do this with the fabcast fabric command. For example:
fab staging master servers.fabcast:deploy
If any of the commands you run themselves require executing on the server, the server will SSH into itself to run them.
The Google Analytics events tracked in this application are:
|Category|Action|Label|Value|
|--------|------|-----|-----|
|liveblog|tweet|location||
|liveblog|facebook|location||
|liveblog|email|location||
|liveblog|new-comment|||
|liveblog|open-share-discuss|||
|liveblog|close-share-discuss|||
|liveblog|summary-copied|||
|liveblog|featured-tweet-action|action||
|liveblog|featured-facebook-action|action||
Released under the MIT open source license. See LICENSE for details.
liveblog was built by the NPR Visuals team.
See CONTRIBUTORS for additional contributors.