spatialdev/single-node-hadoop

Name: single-node-hadoop

Owner: SpatialDev

Description: Pseudo-distributed Hadoop testing environment via Vagrant and Ansible

Forked from: AlanHohn/single-node-hadoop

Created: 2017-01-09 22:04:59.0

Updated: 2017-01-09 22:05:02.0

Pushed: 2016-03-05 23:35:43.0

Homepage: null

Size: 659

Language: Shell

README

Single Node Hadoop

This repository includes a Vagrant VM that uses Ansible to install and start Hadoop in pseudo-distributed mode.

To get started, install Vagrant, Ansible, and VirtualBox, then run vagrant up from inside the repository. The VM is configured to use an Ubuntu wily64 box. Hadoop is downloaded from a mirror and installed. By default, Hadoop 2.6.3 is installed; to choose a different version, set hadoop_version before the first vagrant up or before running vagrant provision.

export hadoop_version=2.7.1
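
The setup steps above can be sketched as a small shell helper. This is only an illustration: it assumes the provisioning picks up hadoop_version from the shell environment, and the function names (missing_tools, pick_hadoop_version, start_vm) are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Sketch only: check the three prerequisites named in the README,
# then choose a Hadoop version and bring up the VM.

# Print the name of every required command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: the first argument wins, then the environment,
# then 2.6.3 (the default version named in the README).
pick_hadoop_version() {
    printf '%s\n' "${1:-${hadoop_version:-2.6.3}}"
}

# Assumption: provisioning reads hadoop_version from the environment.
start_vm() {
    missing=$(missing_tools vagrant ansible-playbook VBoxManage)
    if [ -n "$missing" ]; then
        # Unquoted on purpose so the names print on one line.
        echo "Missing prerequisites:" $missing >&2
        return 1
    fi
    hadoop_version=$(pick_hadoop_version "$1")
    export hadoop_version
    vagrant up
}
```

With these functions loaded, running start_vm 2.7.1 from the repository root would export the version and call vagrant up, while start_vm with no argument falls back to the environment or to 2.6.3.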

After installation, you can visit the HDFS Name Node and YARN Application Manager pages to see the running services.
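
As a rough way to check those pages from a terminal, the sketch below builds the two URLs and probes them with curl. It assumes the Hadoop 2.x default web ports (50070 for the Name Node, 8088 for YARN) and that the VM's ports are reachable on localhost; the HADOOP_HOST variable and both function names are hypothetical, so adjust them to match the Vagrantfile's network settings.

```shell
#!/bin/sh
# Sketch only: build and probe the web UI URLs for a Hadoop 2.x node.
# Assumptions: the UIs are reachable on HADOOP_HOST (default localhost,
# for example through forwarded ports) and use the Hadoop 2.x default ports.

# Print the URL for a named service; unknown names return status 1.
ui_url() {
    host="${HADOOP_HOST:-localhost}"
    case "$1" in
        namenode) printf 'http://%s:50070/\n' "$host" ;;
        yarn)     printf 'http://%s:8088/\n' "$host" ;;
        *)        return 1 ;;
    esac
}

# Report whether each UI answers an HTTP request (requires curl).
check_uis() {
    for service in namenode yarn; do
        url=$(ui_url "$service")
        if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
            echo "$service UI is up at $url"
        else
            echo "$service UI is not reachable at $url"
        fi
    done
}
```

Calling check_uis after vagrant up finishes prints one status line per service.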

The wordcount directory contains an example application from the Hadoop docs. To use it, first bring up the VM and run vagrant ssh to get a shell. Then run:

vagrant/wordcount
ild.sh
.sh
n.sh

The application will show up in the application manager and will also print status and information on the console.
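
The same status can also be checked from the shell inside the VM. The sketch below wraps the yarn application -list command; it assumes the Hadoop bin directory is on PATH inside the VM, and list_apps is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list YARN applications from inside the VM.
# Assumption: the Hadoop bin directory is on PATH so that yarn resolves.

# Show applications in the given state (default ALL).
list_apps() {
    state="${1:-ALL}"
    if ! command -v yarn >/dev/null 2>&1; then
        echo "yarn not found on PATH" >&2
        return 127
    fi
    yarn application -list -appStates "$state"
}
```

For example, list_apps RUNNING would show only applications that are currently running, and list_apps FINISHED those that have completed.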


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.