Name: people-paths
Owner: Laboratório Analytics
Description: Socio-spatio-temporal analysis of people movement in a city from bus ticketing data
Created: 2016-11-28 20:37:37.0
Updated: 2018-05-22 20:59:40.0
Pushed: 2018-05-22 20:59:39.0
Size: 3753
Language: Jupyter Notebook
People Paths is an application that performs a descriptive analysis of bus GPS and passenger ticketing data. It finds the paths taken by public transportation users in a city over a given time period and matches each path's origin and destination locations with social data for the corresponding city areas: population, income and literacy rate.
Bus GPS Data
Bus GPS records for a given time period.
Bus Ticketing Data
Passenger ticketing records for a given time period.
Census Area Data
City census area data with information such as: population, income and literacy rate.
Tested on an Ubuntu 14.04 machine.
'deb http://cran.rstudio.com/bin/linux/ubuntu trusty/'
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
apt-get -y update
apt-get -y upgrade
update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
apt-get -y install r-base gzip
R -e 'install.packages(c("dplyr", "lubridate", "stringr", "sp", "rgeos", "rgdal"), repos = "http://cran.rstudio.com/")'
Download the streaming data files from the ownCloud repository
Unzip the JSON files
gunzip doc*.txt.gz
Convert JSON to CSV format
python json2csv.py doc1-file.txt doc1-file.csv file
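The conversion step can be illustrated with a minimal sketch. It is not the repository's `json2csv.py`: it assumes each line of the unzipped file holds one JSON object, and the function name `json_lines_to_csv` and the sample records are hypothetical.

```python
import csv
import io
import json


def json_lines_to_csv(lines, out):
    """Write one CSV row per JSON object found in `lines`.

    Assumption: every non-empty line is a standalone JSON object.
    Columns are the sorted union of all keys seen in the input.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    fieldnames = sorted({key for record in records for key in record})
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Illustrative input only; real files may use different field names.
sample = [
    '{"VEHICLE": "V001", "LINECODE": "A"}',
    '{"VEHICLE": "V002", "LINECODE": "B"}',
]
buffer = io.StringIO()
count = json_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

Loading all records before writing keeps the header complete even when objects carry different keys, at the cost of holding the file in memory.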
Rscript build_trips_locations_social_dataset.R <code.base.folderpath> <ticketing.data.filepath> <gps.data.filepath> metadata/41CURITI.shp 41CURITI metadata/socioeco.csv <output.filepath> <log.filepath>
Bus GPS Data
| LATITUDE| VEHICLE | LONGITUDE | LINECODE | DATETIME |
|:---:|:---:|:---:|:---:|:---:|
| -25,351073 | V001 | -49,265108 | A | 25/06/2016 23:59:57 |
| -25,35078 | V001 | -49,26514 | A | 25/06/2016 23:59:47 |
| -25,350796 | V001 | -49,265528 | A | 25/06/2016 23:59:40 |
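The GPS sample uses a decimal comma for coordinates and a day/month/year timestamp, so both need converting before any numeric or temporal work. The sketch below shows one way to parse a single record in Python; the function name `parse_gps_row` and the output key names are hypothetical, not part of the project.

```python
from datetime import datetime


def parse_gps_row(latitude, vehicle, longitude, linecode, timestamp):
    """Parse one GPS record given as strings, in the column order shown above."""
    return {
        # Coordinates use a decimal comma, e.g. "-25,351073".
        "latitude": float(latitude.replace(",", ".")),
        "vehicle": vehicle,
        "longitude": float(longitude.replace(",", ".")),
        "line_code": linecode,
        # Timestamps are day/month/four-digit-year with a 24-hour clock.
        "datetime": datetime.strptime(timestamp, "%d/%m/%Y %H:%M:%S"),
    }


row = parse_gps_row("-25,351073", "V001", "-49,265108", "A", "25/06/2016 23:59:57")
```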
Bus Ticketing Data
| VEHICLECODE | LINENAME | CARDNUMBER | LINECODE | TIMESTAMP |
|:---:|:---:|:---:|:---:|:---:|
| 00239 | LINHA A | 0000000000 | A | 25/06/16 06:14:03,000000 |
| 00239 | LINHA B | 0000000001 | B | 25/06/16 06:28:13,000000 |
| 00216 | LINHA C | 0000000002 | C | 25/06/16 08:11:54,000000 |
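The ticketing timestamps differ from the GPS ones: they use a two-digit year and a comma before the fractional seconds. The vehicle code and card number also carry leading zeros, which are lost if the fields are read as integers. A minimal Python sketch of parsing one record follows; the function name `parse_ticketing_row` and the output key names are hypothetical.

```python
from datetime import datetime


def parse_ticketing_row(vehicle_code, line_name, card_number, line_code, timestamp):
    """Parse one ticketing record given as strings, in the column order shown above."""
    return {
        # Identifiers stay as strings so leading zeros are preserved.
        "vehicle_code": vehicle_code,
        "line_name": line_name,
        "card_number": card_number,
        "line_code": line_code,
        # Day/month/two-digit-year, with a comma before the microseconds.
        "timestamp": datetime.strptime(timestamp, "%d/%m/%y %H:%M:%S,%f"),
    }


ticket = parse_ticketing_row(
    "00239", "LINHA A", "0000000000", "A", "25/06/16 06:14:03,000000"
)
```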
|card.num|line.code|o.sector.code|o.neigh.code|o.neigh.name|o.loc|o.timestamp|o.pop|o.income|o.num.literate|d.sector.code|d.neigh.code|d.neigh.name|d.loc|d.timestamp|d.pop|d.income|d.num.literate|
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|000001|260|410690205040042|410690205033|SAO LOURENCO|"-25.388173,-49.260771"|2016-06-25 19:43:55|833|3268.35|781|410690205040042|410690205033|SAO LOURENCO|"-25.388173,-49.260771"|2016-06-25 19:43:59|833|3268.35|781|
|000002|260|410690205040042|410690205033|SAO LOURENCO|"-25.388173,-49.260771"|2016-06-25 19:43:59|833|3268.35|781|410690205010273|410690205014|AHU|"-25.392713,-49.260928"|2016-06-25 22:03:56|975|4211.39|936|
|000003|260|410690205010273|410690205014|AHU|"-25.392713,-49.260928"|2016-06-25 22:03:56|975|4211.39|936|410690205010273|410690205014|AHU|"-25.392713,-49.260928"|2016-06-25 22:04:01|975|4211.39|936|
In the output, the column prefix defines whether it refers to the path origin (o.) or destination (d.). Each column is described below: