Name: DSSG2017-Equity
Owner: UW eScience Institute
Description: Free and open-source mapping tools and data workflow for visualizing neighborhood data
Forked from: DCActionforChildren/dcaction
Created: 2017-06-21 17:39:39.0
Updated: 2017-08-18 21:44:20.0
Pushed: 2017-08-18 21:44:17.0
Homepage: https://uwescience.github.io/DSSG2017-Equity/
Size: 51518
Language: Jupyter Notebook
DC Action for Children, the Kids Count grantee for the District of Columbia, in partnership with DataKind and dozens of volunteers, created this map of measures of child well-being. This tool allows residents, program providers, and others to easily access and view this data. Users select from a list of available data layers to display the data for a given “neighborhood cluster,” a geographic type used in DC, as well as the neighborhood's demographic composition.
The tool currently uses DC's geographies and data, but it can be repurposed for other Kids Count grantees or for other projects and purposes. The code that powers the tool is free and open source, meaning others can copy, change, and redeploy it. DC Action for Children worked with DataKind DC to develop the map. Across the US and around the world, there are communities of volunteers who can support efforts to create a “fork” of this tool specific to your city, state, or region. Read about examples from around the globe.
The Data Tools are an online map of neighborhood data powered by two easy-to-edit documents: a map file of the geographic areas and a spreadsheet file of the data for those areas.
For the DC Action Data Tools deployment, the map file has the 39 DC neighborhood clusters, and the spreadsheet file has 65 columns, half from the US Census Bureau and half from local sources. The data from these primary sources is “crosswalked,” or recalculated, from its original geographic areas (some ZIP codes, some census tracts, some points) to match the neighborhood cluster areas chosen for the map layers in this deployment. A few points files for schools, hospitals, and libraries display as individual points on the map instead of as shaded geographic areas. Two configuration files, called fields and sources, can be updated as described in the Data section below.
This section explains how to set up your own version of the tool.
The first part walks you through deploying your own copy. The second part covers key features that may be useful to you depending on where you are in the world.
What category best describes you?
If you chose A or B, here's how you get started:
If you chose C, here's how you get started:
Where are you in the world?
If you are anywhere in the world, you can:
If you are in the US, you can also use:
If you are in DC, you can also use:
This section describes the data that powers the visualization.
Below, you can learn more about:
The DC Action Data Tool has posted the data sources on its website: https://www.dcactionforchildren.org/dc-kids-count-data-tools-methodology
The status of each of these sources is tracked in this Google Spreadsheet: https://docs.google.com/spreadsheets/d/1uF2nm5CS4tgrx9owv59VBaLnPGYVfXBkRhxT5auQ-3k
There are two functionally identical scripts to retrieve Census data, one written in Ruby and the other in Python. Check out the documentation for the Ruby script.
To make the tool easier to maintain, all data is stored and updated in spreadsheets. The instructions below pertain to the DC Action data, which is kept in a Google Spreadsheet; the tool could equally be powered by an Excel spreadsheet or any other tool that can output CSVs in the existing format.
In addition to the data described in the Updating Layer Data section above, which colors neighborhoods according to their value, we also have the ability to add “points” to the map. These are points of interest like schools, libraries, and hospitals.
A crosswalk is a means of translating data that is aggregated at one geographic level, such as census tract, to another, such as a neighborhood cluster. We do this by using a crosswalk table, which shows the relationships (overlap) between the two. In the case of the above example, this table would contain columns with 1) the census tract ID, 2) the neighborhood cluster ID, and 3) the proportion of the census tract that is contained in the neighborhood.
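As a rough illustration of the three-column table described above, here is a minimal pandas sketch. The column names (tract_id, cluster_id, proportion) and all values are hypothetical, chosen only for this example; they are not taken from the project's own crosswalk files.

```python
import pandas as pd

# Hypothetical crosswalk table: one row per (tract, cluster) overlap.
# IDs and proportions are made up for illustration only.
crosswalk = pd.DataFrame(
    {
        "tract_id": ["T1", "T2", "T2", "T3"],
        "cluster_id": ["C1", "C1", "C2", "C2"],
        "proportion": [1.0, 0.6, 0.4, 1.0],
    }
)

# If the clusters cover a tract completely, its proportions add up to 1.
totals = crosswalk.groupby("tract_id")["proportion"].sum()
fully_covered = bool(((totals - 1.0).abs() < 1e-9).all())

# Tracts listed more than once straddle a cluster boundary.
counts = crosswalk["tract_id"].value_counts()
split_tracts = sorted(counts[counts > 1].index)

print(fully_covered)
print(split_tracts)
```

Checking that each tract's proportions sum to 1 is a cheap way to catch tracts that fall partly outside every cluster before any data is apportioned.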
Crosswalks are commonly used to express relationships between different levels of geography. We can simply and reliably crosswalk county-level data to the state level, for example, because counties nest within states and we know the population characteristics at each level. We have to make more assumptions when crosswalking data from tract to neighborhood, because neighborhood boundaries do not follow census boundary lines and we do not have benchmark estimates of population characteristics at the neighborhood level.
The first assumption we make is that the population is uniformly distributed across the tract. For most tracts, this assumption does not matter because the entire tract is contained within one neighborhood cluster; it is only important for tracts that cross neighborhood boundaries. To allocate the population of such a tract, we use the proportion of the tract's land area that overlaps each neighborhood as the apportioning factor: the tract-level population is divided among the neighborhoods in proportion to the land area each one covers.
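The land-area apportionment can be sketched in a few lines of pandas. This is an illustrative sketch under the uniform-distribution assumption, not the project's crosswalk.py; the table layout, column names, and numbers are all hypothetical.

```python
import pandas as pd

# Hypothetical crosswalk: tract T2 is split 75% / 25% between two clusters.
crosswalk = pd.DataFrame(
    {
        "tract_id": ["T1", "T2", "T2", "T3"],
        "cluster_id": ["C1", "C1", "C2", "C2"],
        "proportion": [1.0, 0.75, 0.25, 1.0],
    }
)

# Hypothetical tract-level populations.
tracts = pd.DataFrame(
    {
        "tract_id": ["T1", "T2", "T3"],
        "population": [1000, 2000, 500],
    }
)

# Attach each tract's population to every cluster it overlaps,
# then scale by the share of land area falling in that cluster.
merged = crosswalk.merge(tracts, on="tract_id", how="left")
merged["population_share"] = merged["population"] * merged["proportion"]

# Sum the pieces to get cluster-level estimates.
cluster_population = merged.groupby("cluster_id")["population_share"].sum()
print(cluster_population)
```

Because each tract's proportions sum to 1, the cluster totals add back up to the original tract total, so no population is created or lost in the translation.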
The next assumption we make is that tracts are socioeconomically integrated. When we apportion tract-level data on characteristics such as poverty, we apply the same land-area apportioning factor used for the total population. This assumes, for example, that a tract does not contain concentrated pockets of poverty.
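A small worked example may make the consequence of this assumption concrete. When the same factor scales both the population and the count of people in poverty, every piece of a split tract inherits the tract's overall poverty rate. The numbers and column names below are hypothetical.

```python
import pandas as pd

# One hypothetical tract split across two clusters by land area.
pieces = pd.DataFrame(
    {
        "cluster_id": ["C1", "C2"],
        "proportion": [0.75, 0.25],
    }
)

tract_population = 2000
tract_in_poverty = 400  # tract-wide poverty rate of 20%

# Apply the same land-area factor to both counts.
pieces["population"] = tract_population * pieces["proportion"]
pieces["in_poverty"] = tract_in_poverty * pieces["proportion"]

# The factor cancels out, so each piece has the tract-wide rate.
pieces["poverty_rate"] = pieces["in_poverty"] / pieces["population"]
print(pieces)
```

If poverty in the real tract were concentrated in the part that falls in C2, this method would understate C2's rate and overstate C1's, which is the limitation the assumption describes.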
The wiki contains more information on the crosswalking DC Action applied to the data.
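To run the crosswalk script, first install its dependencies from the terminal. If Python is not yet installed, install it with Homebrew and then add the libraries:
brew install python
pip install pandas
pip install numpy
If Python is already installed, only the libraries are needed:
pip install pandas
pip install numpy
Then run the script:
python crosswalk.py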