Name: dataproc-pubsub-spark-streaming
Owner: Google Cloud Platform
Description: null
Created: 2018-05-09 15:35:35.0
Updated: 2018-05-21 18:31:50.0
Pushed: 2018-05-21 18:31:49.0
Homepage: null
Size: 30
Language: Scala
In this tutorial, you learn how to deploy an Apache Spark streaming application on Cloud Dataproc and process messages from Cloud Pub/Sub in near real time. The system you build in this scenario generates thousands of random tweets, identifies trending hashtags over a sliding window, saves the results in Cloud Datastore, and displays them on a web page.
Please refer to the related article for all the steps to follow in this tutorial: [INSERT LINK WHEN PUBLISHED]
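To make the "trending hashtags over a sliding window" idea concrete, here is a minimal standalone sketch in plain Python. It is not the code in this repository and does not use Spark, Pub/Sub, or Datastore. The function names, the `(timestamp, text)` tweet shape, and the 60-second window with a 20-second slide are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")


def trending_hashtags(tweets, now, window_seconds=60, top_n=3):
    """Count hashtags in tweets whose timestamp falls in (now - window, now].

    `tweets` is an iterable of (timestamp_in_seconds, text) pairs; this shape
    is an assumption for the sketch, not the repository's message format.
    """
    counts = Counter()
    for timestamp, text in tweets:
        if now - window_seconds < timestamp <= now:
            # Lowercase so that '#Spark' and '#spark' count as the same tag.
            counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(top_n)


def sliding_windows(tweets, start, end, window_seconds=60, slide_seconds=20, top_n=3):
    """Re-evaluate the window every `slide_seconds` from `start` to `end` inclusive."""
    return [
        (now, trending_hashtags(tweets, now, window_seconds, top_n))
        for now in range(start, end + 1, slide_seconds)
    ]


if __name__ == "__main__":
    sample = [
        (5, "Loving #spark on #dataproc"),
        (15, "#Spark streaming is fun"),
        (30, "Reading from #pubsub"),
        (70, "More #pubsub messages"),
    ]
    for now, top in sliding_windows(sample, start=20, end=80):
        print(now, top)
```

Because the window is longer than the slide interval, consecutive windows overlap, so a tweet can contribute to several successive results before it ages out. A streaming engine would typically update these counts incrementally rather than rescanning every tweet as this sketch does.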
Contents of this repository:
http_function: JavaScript code for the HTTP function deployed on Cloud Functions.
spark: Scala code for the Apache Spark streaming application.
tweet-generator: Python code for the randomized tweet generator.

To run the tests:
park
test