nprapps/geocode-nominatim

Name: geocode-nominatim

Owner: NPR visuals team

Description: Geocode structured & unstructured addresses using Nominatim service

Created: 2017-04-07 00:20:15.0

Updated: 2017-04-07 00:24:09.0

Pushed: 2017-04-07 00:29:56.0

Homepage: null

Size: 8

Language: Python

README

geocode-nominatim

What is this?

Geocode addresses using the Nominatim geocoding service.

It uses a simple file-based cache to avoid sending redundant geocoding requests.

If you plan to run a big geocoding batch, please contact the Nominatim service beforehand to let them know. They can point you to a good time to run it and recommend a request frequency.

Assumptions

The following things are assumed to be true in this documentation.

For more details on the technology stack used with the app-template, see our development environment blog post.

This code should work fine in most recent versions of Linux, but package installation and system dependencies may vary.

What's in here?

The project contains the following folders and important files:

Bootstrap the project

To bootstrap the project:

git clone git@github.com:nprapps/geocode-nominatim.git
cd geocode-nominatim
mkvirtualenv geocode-nominatim
pip install -r requirements.txt

Geocode unstructured data

To geocode an unstructured address, create a CSV file with the following headers:

python geocode.py $CSVFILE

Where $CSVFILE is the path to the CSV file on your hard drive.

The results will be stored in the output folder.
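
As an illustration of what an unstructured lookup amounts to, here is a minimal sketch that builds a free-form Nominatim search URL with the `q` parameter. It is not taken from geocode.py: the function name, the example address, and the use of the public OpenStreetMap endpoint are assumptions, and the sketch targets Python 3.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint (assumption: swap in your own instance if you host one).
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_freeform_url(address, base_url=NOMINATIM_SEARCH_URL):
    """Return a Nominatim search URL for a single free-form address string."""
    # Hypothetical helper: "q" carries the whole address as one string.
    params = {"q": address, "format": "json", "limit": 1}
    return base_url + "?" + urlencode(params)


url = build_freeform_url("1111 North Capitol St NE, Washington, DC")
print(url)
```

The sketch only builds the URL; sending the request and reading the JSON response is left out so the example runs without network access.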

Geocode structured data

To geocode a structured address, create a CSV file with the following headers:

Fill in one field, or as many of the fields as you need, to specify the location that you want to geocode. Then run the script:

python geocode.py $CSVFILE

Where $CSVFILE is the path to the CSV file on your hard drive.

The results will be stored in the output folder.
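
For comparison, a structured lookup can be sketched as a query that sends each address component as its own parameter. The field names below are the structured-query parameters of the Nominatim search API; whether the CSV headers expected by geocode.py use the same names is an assumption, as are the function name and the endpoint.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint (assumption).
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"

# Structured-query parameters accepted by the Nominatim search API.
STRUCTURED_FIELDS = ("street", "city", "county", "state", "country", "postalcode")


def build_structured_url(row, base_url=NOMINATIM_SEARCH_URL):
    """Return a search URL using only the structured fields that are filled in."""
    params = {}
    for field in STRUCTURED_FIELDS:
        value = (row.get(field) or "").strip()
        if value:  # skip empty cells so partial rows still work
            params[field] = value
    if not params:
        raise ValueError("row has no structured fields to geocode")
    params["format"] = "json"
    return base_url + "?" + urlencode(params)


structured_url = build_structured_url({"street": "", "city": "Washington", "state": "DC"})
print(structured_url)
```

Empty fields are dropped, which mirrors the instruction above to fill in only the fields you need.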

Geocode mixed data

If you have a mix of unstructured and structured locations, create a CSV file with the following headers:

For unstructured locations, fill in the address. For structured locations, fill in one field, or as many of the fields as you need, to specify the location that you want to geocode. Then run the script:

python geocode.py $CSVFILE

Where $CSVFILE is the path to the CSV file on your hard drive.

The results will be stored in the output folder.
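
One way to handle mixed rows is to decide per row which kind of query to send. The sketch below is a guess at that logic, not the script's actual code: the `address` column name and the structured column names are hypothetical. It prefers the free-form address when present, because the Nominatim search API does not allow `q` to be combined with structured parameters.

```python
# Hypothetical column names; the real CSV headers may differ.
STRUCTURED_FIELDS = ("street", "city", "county", "state", "country", "postalcode")


def query_params_for_row(row):
    """Pick free-form or structured query parameters for one CSV row."""
    address = (row.get("address") or "").strip()
    if address:
        # Free-form wins: "q" cannot be mixed with structured parameters.
        return {"q": address}
    params = {
        field: row[field].strip()
        for field in STRUCTURED_FIELDS
        if (row.get(field) or "").strip()
    }
    # None signals that the row has nothing to geocode.
    return params or None


print(query_params_for_row({"address": "", "city": "Austin", "state": "TX"}))
```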

Advanced Configuration

The geocode.py script can be customized with the following advanced options.
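
As a rough picture of how the flags described in this section fit together, here is an `argparse` sketch. Only the short flags (`-d`, `-s`, `-C`, `-w`) come from this README; the destination names, help texts, and the default wait of one second are assumptions, and the script's real parser may look different.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(description="Geocode a CSV file with Nominatim")
    parser.add_argument("csvfile", help="path to the input CSV file")
    parser.add_argument("-d", dest="debug", action="store_true",
                        help="verbose execution")
    parser.add_argument("-s", dest="sample", type=int, default=None,
                        help="only process the first N lines of the CSV")
    parser.add_argument("-C", dest="no_cache", action="store_true",
                        help="do not use the file cache")
    # Default of 1.0 is an assumption, not the script's documented default.
    parser.add_argument("-w", dest="wait", type=float, default=1.0,
                        help="seconds to wait between requests")
    return parser


args = build_parser().parse_args(["addresses.csv", "-s", "10", "-w", "2"])
print(args)
```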

Debugging

You can add a debug flag to the script for a more verbose execution:

python geocode.py $CSVFILE -d

Sample

If you want to test the execution on a sample of the data before running the full dataset:

python geocode.py $CSVFILE -s $SAMPLE_SIZE

Where $SAMPLE_SIZE is the number of lines, counted from the beginning of the CSV, to use for the sample.
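
Taking the sample from the beginning of the file can be sketched with `itertools.islice`. The function name and the in-memory CSV below are illustrative only, and the sketch assumes the sample size counts data rows after the header.

```python
import csv
import io
from itertools import islice


def read_sample(csv_file, sample_size=None):
    """Return the first sample_size data rows, or every row when sample_size is None."""
    reader = csv.DictReader(csv_file)
    # islice stops after sample_size rows; a stop of None means no limit.
    return list(islice(reader, sample_size))


# In-memory stand-in for a CSV file on disk.
data = io.StringIO("address\nA St\nB St\nC St\n")
rows = read_sample(data, 2)
print([row["address"] for row in rows])
```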

No Cache

The script uses a file-based cache to reduce the number of requests sent to the Nominatim service.
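
To make the idea concrete, here is a minimal sketch of a file-based cache that stores results as JSON keyed by the query string. The class name, the JSON format, and the `fetch` callback are assumptions; the script's actual cache file may be organized differently.

```python
import json
import os
import tempfile


class FileCache:
    """Tiny JSON-backed cache keyed by the query string (hypothetical design)."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) only on a miss."""
        if key not in self.data:
            self.data[key] = fetch(key)
            with open(self.path, "w") as f:
                json.dump(self.data, f)
        return self.data[key]


calls = []


def fake_fetch(key):
    # Stand-in for a real geocoding request; records each call.
    calls.append(key)
    return {"lat": "0", "lon": "0"}


cache_path = os.path.join(tempfile.mkdtemp(), "cache.json")
first = FileCache(cache_path).get_or_fetch("some address", fake_fetch)
# A second cache object reads the file written by the first and skips the fetch.
second = FileCache(cache_path).get_or_fetch("some address", fake_fetch)
print(len(calls))
```

Rewriting the whole file on every miss keeps the sketch short; for large batches an append-only format or a small database would scale better.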

If you do not want to use the file cache at all, add the no-cache flag like this:

python geocode.py $CSVFILE -C

Wait between geocoding executions

You can customize the number of seconds to wait between consecutive requests to the geocoding service:

python geocode.py $CSVFILE -w $WAIT_SECONDS

Where $WAIT_SECONDS is the number of seconds to wait before the next request.
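
The waiting behavior can be sketched as a generator that pauses between consecutive items. This is an illustration under assumptions, not the script's implementation: the function name and the injectable `sleep` parameter are hypothetical. For the public OpenStreetMap instance, the usage policy sets an absolute maximum of one request per second, so a wait of at least one second is a sensible floor there.

```python
import time


def throttled(items, wait_seconds, sleep=time.sleep):
    """Yield items one by one, pausing wait_seconds between consecutive items."""
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(wait_seconds)
        yield item


# Record the pauses instead of actually sleeping, so the example runs instantly.
pauses = []
result = list(throttled(["a", "b", "c"], 1.0, sleep=pauses.append))
print(result, pauses)
```

In real use you would omit the `sleep` argument and issue one geocoding request per yielded address.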

