TransparencyToolkit/TSJobCrawler

Name: TSJobCrawler

Owner: Transparency Toolkit

Description: Collects listings for jobs that require security clearance.

Created: 2017-03-06 21:42:09.0

Updated: 2017-04-20 17:59:45.0

Pushed: 2017-03-14 13:26:52.0

Homepage: null

Size: 83

Language: Ruby


README

This is a crawler for job listings that require security clearance.

To run:

t = TSJobCrawler.new("search term", request_manager, cm_hash)  # the search term and cm_hash may each be nil

t.crawl_jobs

For example:

Headless.ly do
  r = RequestManager.new(nil, [0, 0], 1)
  t = TSJobCrawler.new(nil, r, nil)
  t.crawl_jobs
  File.write("test.json", t.gen_json)
end
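Once test.json exists, a quick sanity check is to load it back with Ruby's standard json library. The sketch below assumes that gen_json returns a JSON array with one object per listing, which this README does not confirm. summarize_listings is a hypothetical helper and not part of TSJobCrawler.

```ruby
require 'json'

# Hypothetical helper (not part of TSJobCrawler): summarize crawler output.
# Assumption: the JSON text is an array of objects, one per job listing.
def summarize_listings(json_text)
  listings = JSON.parse(json_text)
  raise ArgumentError, 'expected a JSON array' unless listings.is_a?(Array)

  # Collect every field name that appears in at least one listing.
  fields = listings.flat_map { |l| l.is_a?(Hash) ? l.keys : [] }.uniq.sort
  { count: listings.length, fields: fields }
end

# Usage, after running the example above:
#   puts summarize_listings(File.read('test.json'))
```

If the output turns out to have a different shape, the ArgumentError makes that obvious immediately rather than producing a misleading count.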

If you pass nil as the search term, the crawler downloads as many job listings as possible. Unless you have a lot of RAM, run a full download like this through Harvester, so that you can take advantage of its incremental result reporting.
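To illustrate why incremental reporting helps with memory, here is a minimal sketch of the general idea: results are flushed to disk in small batches instead of accumulating in one large array. This is not Harvester's actual mechanism or API. IncrementalWriter, its batch size, and the one-JSON-object-per-line output format are all assumptions made for illustration.

```ruby
require 'json'

# Hypothetical illustration of incremental result reporting.
# Not part of Harvester or TSJobCrawler.
class IncrementalWriter
  attr_reader :written

  def initialize(io, batch_size = 100)
    @io = io
    @batch_size = batch_size
    @buffer = []
    @written = 0
  end

  # Queue one result; flush once the buffer reaches the batch size.
  def add(result)
    @buffer << result
    flush if @buffer.length >= @batch_size
  end

  # Write buffered results as one JSON object per line, then clear the buffer.
  def flush
    @buffer.each { |r| @io.puts(JSON.generate(r)) }
    @written += @buffer.length
    @buffer.clear
  end
end

# Usage (results is any enumerable of hashes):
#   File.open('listings.jsonl', 'w') do |f|
#     writer = IncrementalWriter.new(f, 50)
#     results.each { |r| writer.add(r) }
#     writer.flush # write whatever is left in the final partial batch
#   end
```

With this pattern, memory use is bounded by the batch size rather than by the total number of listings, which is the benefit the paragraph above attributes to incremental reporting.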


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.