LLNL/scraper

Name: scraper

Owner: Lawrence Livermore National Laboratory

Description: Python library for getting metadata from source code hosting tools

Created: 2016-10-18 16:44:32.0

Updated: 2018-03-23 23:33:38.0

Pushed: 2018-03-23 23:33:36.0

Homepage:

Size: 103

Language: Python

GitHub Committers

User | Most Recent Commit | # Commits

Other Committers

User | Email | Most Recent Commit | # Commits

README

Scraper

Scraper is a tool for scraping and visualizing open source data from GitHub.

Getting Started: Code.gov

Code.gov is a newly launched US Federal Government website that gives the public access to metadata about the government's custom-developed software. The site requires metadata to function, and this Python library can help generate it.

To get started, you will need a GitHub personal access token to make requests to the GitHub API. Set it in your environment or shell rc file under the name GITHUB_API_TOKEN:

$ export GITHUB_API_TOKEN=XYZ

$ echo "export GITHUB_API_TOKEN=XYZ" >> ~/.bashrc
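As a rough illustration of how a Python program can pick up this variable, the sketch below reads GITHUB_API_TOKEN from the environment and builds request headers for the GitHub API. The function name `github_auth_headers` is hypothetical and is not part of Scraper; how Scraper itself consumes the token is not shown in this README.

```python
import os


def github_auth_headers(env=None):
    """Build GitHub API request headers from GITHUB_API_TOKEN.

    Hypothetical helper for illustration only; not Scraper's own code.
    `env` defaults to the process environment and can be replaced with a
    plain dict for testing.
    """
    env = os.environ if env is None else env
    token = env.get("GITHUB_API_TOKEN", "").strip()
    if not token:
        # Fail early with a clear message instead of sending
        # unauthenticated requests.
        raise RuntimeError(
            "GITHUB_API_TOKEN is not set; export it before running scraper"
        )
    return {
        "Authorization": f"token {token}",
        "Accept": "application/vnd.github+json",
    }
```

Calling `github_auth_headers()` with no arguments uses the current shell environment, so it raises if the export above has not been run in that shell.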

To generate a code.json file for your GitHub organization, install the package from a checkout of this repository and then run scraper:

$ pip install -e .

$ scraper --agency <agency_name> --github-orgs <list of github org usernames ...>

# Example
$ scraper --agency DOE --github-orgs llnl

A full example of the resulting code.json file can be found here.
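For orientation, here is a minimal sketch of assembling a code.json-style document from repository metadata. The field names (`agency`, `releases`, `name`, `repositoryURL`, `description`) are assumptions loosely modeled on the Code.gov metadata format, and `build_code_json` is a hypothetical helper; the file Scraper actually produces may contain more fields or different ones.

```python
import json


def build_code_json(agency, repos):
    """Assemble a minimal code.json-style dict (illustrative only).

    `repos` is a list of dicts with "name" and "url" keys and an optional
    "description" key. The output field names are assumed, not taken
    from Scraper's source.
    """
    releases = []
    for repo in repos:
        releases.append({
            "name": repo["name"],
            "repositoryURL": repo["url"],
            "description": repo.get("description", ""),
        })
    return {"agency": agency, "releases": releases}


if __name__ == "__main__":
    # Example input; the values are illustrative.
    doc = build_code_json("DOE", [{
        "name": "scraper",
        "url": "https://github.com/LLNL/scraper",
        "description": "Python library for getting metadata from "
                       "source code hosting tools",
    }])
    print(json.dumps(doc, indent=2))
```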

License

Scraper is released under an MIT license. For more details see the LICENSE file.

LLNL-CODE-705597


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.