Name: spark-json-relay
Owner: Hammer Lab
Description: SparkListener that converts SparkListenerEvents to JSON and forwards them to an external service via RPC.
Created: 2015-06-09 17:32:19.0
Updated: 2017-10-18 07:03:40.0
Pushed: 2016-10-28 09:03:58.0
Homepage: null
Size: 280
Language: Scala
JsonRelay is a SparkListener that converts SparkListenerEvents to JSON and forwards them to an external service via RPC.
It is designed to be used with slim, which consumes JsonRelay's emitted events and writes useful statistics about them to Mongo, from which Spree serves up live-updating web pages.
With Spark >= 1.5.0, you can simply pass the following flags to your spark-shell and spark-submit commands:
--packages org.hammerlab:spark-json-relay:2.0.1
--conf spark.extraListeners=org.apache.spark.JsonRelay
If using earlier versions of Spark, you'll need to first download the JAR:
wget https://repo1.maven.org/maven2/org/hammerlab/spark-json-relay/2.0.1/spark-json-relay-2.0.1.jar
Then, pass these flags to your spark-submit or spark-shell commands:
--driver-class-path spark-json-relay-2.0.1.jar
--conf spark.extraListeners=org.apache.spark.JsonRelay
That's it!
Two additional flags, --conf spark.slim.{host,port}, specify the host and port that JsonRelay will attempt to connect to and send events to.
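As a minimal sketch of how these flags combine for Spark >= 1.5.0, the script below assembles a complete spark-shell command line. The host and port values are hypothetical placeholders, not documented defaults, and json_relay_flags is a helper name introduced only for this illustration. The command is echoed rather than executed, so the sketch runs without a Spark installation.

```shell
#!/bin/sh
# Hypothetical address of the service that receives events (placeholders, not defaults).
SLIM_HOST="slim.example.com"
SLIM_PORT="8123"

# Print the JsonRelay-related flags for Spark >= 1.5.0, one per line.
json_relay_flags() {
  printf '%s\n' \
    "--packages org.hammerlab:spark-json-relay:2.0.1" \
    "--conf spark.extraListeners=org.apache.spark.JsonRelay" \
    "--conf spark.slim.host=${SLIM_HOST}" \
    "--conf spark.slim.port=${SLIM_PORT}"
}

# The unquoted substitution is intentional: word splitting joins the flags with single spaces.
echo spark-shell $(json_relay_flags)
```

To launch for real, drop the echo; the same flags should apply to spark-submit.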
JsonRelay piggybacks on Spark's JsonProtocol for JSON serialization, with two differences:
- It adds an appId field to all events; this allows downstream consumers to process events from multiple Spark applications simultaneously and more easily over time.
- It serializes SparkListenerExecutorMetricsUpdate events, which Spark's JsonProtocol omits in Spark versions prior to 1.5.0 (cf. SPARK-9036).

Please file an issue if you have any questions about or problems using JsonRelay!