GoogleCloudPlatform/spark-examples

Name: spark-examples

Owner: Google Cloud Platform

Description: Spark pipelines that correspond to a series of Dataflow examples.

Created: 2016-01-06 22:49:48.0

Updated: 2018-01-22 18:46:42.0

Pushed: 2018-01-09 00:38:53.0

Homepage:

Size: 27

Language: Java

README

spark-examples

This repository contains a running series of Spark examples for comparison with a matching series of Dataflow examples.

Before diving in here, you likely want to read Dataflow/Beam & Spark: A Programming Model Comparison.

The equivalent Dataflow code lives in GoogleCloudPlatform/DataflowJavaSDK-examples and is documented in Mobile Gaming Pipeline Examples.

For details on running these examples on a Google Cloud Dataproc cluster, please see this README.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.