appirio-tech/ap-emr-skills

Name: ap-emr-skills

Owner: Topcoder

Description: MapReduce job for Aggregating skills

Created: 2015-08-27 19:32:09.0

Updated: 2016-04-11 14:49:02.0

Pushed: 2015-11-06 01:19:57.0

Homepage: null

Size: 34356

Language: Java

README

ap-emr-skills

Packaged JARs that run the MapReduce job(s) for aggregating skills.

Mappers supported
  1. User-entered skills
  2. Skills from challenges the user successfully participated in
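The aggregation idea behind these mappers can be sketched in plain Java, without any Hadoop dependency: each source contributes (user, skill) pairs, and the pairs are merged into one de-duplicated skill set per user. This is only an illustration, not the repository's code. The class name `SkillsAggregationSketch`, the method `aggregate`, and the tab-separated `userId<TAB>skill` line format are all assumptions made for the example; the real input layout is not documented here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: merges skills from several sources into one set per user.
class SkillsAggregationSketch {

    // Each inner list stands for one source (e.g. user-entered skills, challenge skills).
    // Assumed line format: "userId<TAB>skill". Lines that do not match are skipped.
    static Map<String, Set<String>> aggregate(List<List<String>> sources) {
        Map<String, Set<String>> skillsByUser = new TreeMap<>();
        for (List<String> source : sources) {
            for (String line : source) {
                String[] fields = line.split("\t", 2);
                if (fields.length < 2) {
                    continue; // malformed line: no tab separator
                }
                String userId = fields[0].trim();
                // Lower-case so "Java" and "java" count as the same skill.
                String skill = fields[1].trim().toLowerCase(Locale.ROOT);
                if (userId.isEmpty() || skill.isEmpty()) {
                    continue; // malformed line: empty field
                }
                // "Map" step emits (userId, skill); "reduce" step is the per-user set union.
                skillsByUser.computeIfAbsent(userId, k -> new TreeSet<>()).add(skill);
            }
        }
        return skillsByUser;
    }

    public static void main(String[] args) {
        List<String> userEntered = Arrays.asList("u1\tJava", "u2\tPython");
        List<String> challenge = Arrays.asList("u1\tHadoop", "u1\tjava");
        // prints {u1=[hadoop, java], u2=[python]}
        System.out.println(aggregate(Arrays.asList(userEntered, challenge)));
    }
}
```

In a real Hadoop job the inner loop body would live in a mapper per source and the set union in a reducer keyed by user; the sketch collapses both into one method to keep it self-contained.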
Running locally
Set up Hadoop (Mac install guide):

http://zhongyaonan.com/hadoop-tutorial/setting-up-hadoop-2-6-on-mac-osx-yosemite.html

AWS CLI

Create Cluster:

aws emr create-cluster --name "SkillsTest3" --enable-debugging --log-uri s3://supply-emr/skills/logs/skillstest3 --release-label emr-4.0.0 --applications Name=Hive Name=Hadoop --use-default-roles --ec2-attributes KeyName=topcoder-dev-vpc-app --instance-type m3.xlarge --no-auto-terminate
Build Test
hadoop jar target/ap-emr-skills-1.0-SNAPSHOT.jar com.appirio.mapreduce.skills.SkillsAggregator src/test/resources/skills/input/userEnteredSkills.txt src/test/resources/skills/input/challengeSkills.txt src/test/resources/skills/input/stackOverflowSkills.txt /tmp/skills
References
Sqoop

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.