bomboradata/findspark

Name: findspark

Owner: Bombora

Description: null

Created: 2016-07-01 18:44:10.0

Updated: 2016-07-01 18:44:11.0

Pushed: 2016-05-15 19:41:31.0

Homepage: null

Size: 15

Language: Python

README

Find spark

PySpark isn't on sys.path by default, but that doesn't mean it can't be used as a regular library. You can address this by either symlinking pyspark into your site-packages, or adding pyspark to sys.path at runtime. findspark does the latter.
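A minimal sketch of the runtime approach, assuming a Spark layout in which the Python sources live under python/ and a py4j archive sits under python/lib/. The helper name add_pyspark_to_path is hypothetical and is not part of findspark's API.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Prepend Spark's Python directories to sys.path (illustrative only)."""
    spark_python = os.path.join(spark_home, "python")
    # PySpark depends on py4j, which is assumed here to ship as a zip archive.
    py4j_zips = glob.glob(os.path.join(spark_python, "lib", "py4j-*.zip"))
    if not py4j_zips:
        raise ValueError("no py4j archive found under %s" % spark_python)
    new_entries = [spark_python, py4j_zips[0]]
    # Prepend so these entries take precedence over anything already on the path.
    sys.path[:0] = new_entries
    return new_entries
```

After a call such as add_pyspark_to_path('/path/to/spark_home'), a plain import of pyspark should resolve against that installation.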

To initialize PySpark, just call

import findspark
findspark.init()

import pyspark
sc = pyspark.SparkContext(appName="myAppName")

Without any arguments, the SPARK_HOME environment variable will be used, and if that isn't set, other possible install locations will be checked. If you've installed Spark with

brew install apache-spark

on OS X, the location /usr/local/opt/apache-spark/libexec will be searched.
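The lookup order described here can be sketched roughly as follows. The function name guess_spark_home and its candidates parameter are hypothetical; only the SPARK_HOME variable and the Homebrew path come from the text, and the real search list may differ.

```python
import os


def guess_spark_home(environ=None,
                     candidates=("/usr/local/opt/apache-spark/libexec",)):
    """Return SPARK_HOME if set, else the first candidate directory that exists."""
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if spark_home:
        return spark_home
    # Fall back to well-known install locations (an assumed, illustrative list).
    for path in candidates:
        if os.path.isdir(path):
            return path
    raise ValueError("Spark not found; set SPARK_HOME or pass a location explicitly")
```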

Alternatively, you can specify a location with the spark_home argument.

findspark.init('/path/to/spark_home')

To verify the automatically detected location, call

findspark.find()

Findspark can add a startup file to the current IPython profile so that the environment variables will be properly set and pyspark will be imported upon IPython startup. This file is created when edit_profile is set to True.

ipython --profile=myprofile
findspark.init('/path/to/spark_home', edit_profile=True)
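As a rough illustration of the mechanism: IPython runs the .py files found in a profile's startup directory each time it starts. The sketch below writes such a file. The helper write_startup_file, the file name 00-findspark.py, and the file's contents are assumptions for illustration and are not necessarily what findspark itself writes.

```python
import os


def write_startup_file(startup_dir, spark_home):
    """Write a script into startup_dir that IPython would run at startup."""
    path = os.path.join(startup_dir, "00-findspark.py")
    lines = [
        "import findspark",
        "findspark.init(%r)" % spark_home,  # hard-code the chosen location
        "import pyspark",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path
```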

Findspark can also add to the .bashrc configuration file, if it is present, so that the environment variables will be properly set whenever a new shell is opened. This is enabled by setting the optional argument edit_rc to True.

findspark.init('/path/to/spark_home', edit_rc=True)
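The idea of persisting settings to a shell configuration file can be sketched as below. The helper append_exports is hypothetical, and exporting only SPARK_HOME is an assumption; the text does not say exactly which variables findspark writes.

```python
import os


def append_exports(rc_path, spark_home):
    """Append a SPARK_HOME export to rc_path if the file exists.

    Returns True if a line was written, False otherwise.
    """
    if not os.path.isfile(rc_path):
        return False  # only edit a file that is already present
    export_line = 'export SPARK_HOME="%s"' % spark_home
    with open(rc_path) as f:
        if export_line in f.read():
            return False  # already persisted, avoid duplicates
    with open(rc_path, "a") as f:
        f.write("\n# Added for PySpark (illustrative)\n" + export_line + "\n")
    return True
```

Checking for an existing line before appending keeps repeated calls from growing the file.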

If changes are persisted, findspark will not need to be called again unless the Spark installation is moved.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.