Name: spark-ui-proxy
Owner: SURFsara
Description: Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall
Forked from: aseigneurin/spark-ui-proxy
Created: 2018-03-16 11:16:21.0
Updated: 2018-03-16 11:16:23.0
Pushed: 2017-11-17 18:59:08.0
Homepage: null
Size: 20
Language: Python
If you are running a Spark Standalone cluster behind a firewall (for example, on AWS), you may have trouble accessing the UI of your cluster. Each worker has its own UI, which makes it difficult, if not impossible, to reroute all the ports using only SSH tunnels.
                        Firewall
                            |
                            |     ------------------------------
                            |     | Spark Master               |
                            |     | e.g. http://10.0.0.1:8080  |
                            |     ------------------------------
                            |
---------------------       |     ------------------------------
| Your computer     | ----->X     | Spark Worker               |
| e.g. 192.168.0.10 |       |     | e.g. http://10.0.0.2:8080  |
---------------------       |     ------------------------------
                            |
                            |     ------------------------------
                            |     | Spark Worker               |
                            |     | e.g. http://10.0.0.3:8080  |
                            |     ------------------------------
                            |
This Python script creates a lightweight HTTP server that proxies all the requests to your Spark Master and Spark Workers. All you have to do is create a single SSH tunnel to this proxy, and the proxy will forward all the requests for you. All the links between the nodes will be functional.
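To make the idea concrete, here is a minimal sketch of the two steps such a proxy needs: mapping an incoming path to an internal node, and rewriting absolute links in the returned HTML so they point back through the proxy. The `/proxy:<host>:<port>` path scheme and the names `resolve_target` and `rewrite_links` are assumptions chosen for illustration, not necessarily what spark-ui-proxy.py does.

```python
import re

# Hypothetical scheme: "/proxy:<host>:<port>/<rest>" addresses a specific node;
# any other path is sent to the default target (the Spark Master).
PROXY_PATH = re.compile(r"^/proxy:([^/:]+):(\d+)(/.*)?$")

# Absolute links to internal nodes, as they might appear in the UI's HTML.
ABSOLUTE_LINK = re.compile(r'(href|src)="http://([^/":]+):(\d+)')


def resolve_target(path, default_target):
    """Return the internal URL to fetch for a path requested on the proxy."""
    match = PROXY_PATH.match(path)
    if match:
        host, port, rest = match.group(1), match.group(2), match.group(3) or "/"
        return "http://%s:%s%s" % (host, port, rest)
    return "http://%s%s" % (default_target, path)


def rewrite_links(html):
    """Rewrite absolute node links so the browser requests them via the proxy."""
    return ABSOLUTE_LINK.sub(r'\1="/proxy:\2:\3', html)
```

With this scheme, a link to a worker at `http://10.0.0.2:8080/` is served to the browser as `/proxy:10.0.0.2:8080/`, so every click goes through the single tunnelled port.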
                        Firewall
                            |
                            |                                      ------------------------------
                            |                                      | Spark Master               |
                            |                                   -> | e.g. http://10.0.0.1:8080  |
                            |                                  /   ------------------------------
                            |                                 /
---------------------    tunnel     ------------------------ /     ------------------------------
| Your computer     | ------------> | spark-ui-proxy       | ----> | Spark Worker               |
| e.g. 192.168.0.10 | :9999    :9999| http://10.0.0.1:9999 | \     | e.g. http://10.0.0.2:8080  |
---------------------       |       ------------------------  \    ------------------------------
                            |                                  \
                            |                                   \  ------------------------------
                            |                                    ->| Spark Worker               |
                            |                                      | e.g. http://10.0.0.3:8080  |
                            |                                      ------------------------------
                            |
Let's say the Spark Master has its UI running on localhost:8080 (localhost refers to the Spark Master node), and we want to access that UI on localhost:9999 (localhost here refers to your computer).
Start by creating an SSH tunnel from your computer to the Spark Master (but it could be to any of the nodes):
    ssh -L 9999:localhost:9999 <public IP/name of the node>
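If you prefer to open the tunnel from a script, the same command can be assembled in Python. This is only a sketch: the helper names `build_tunnel_command` and `open_tunnel` are hypothetical, and the `-N` flag (forward ports without running a remote command) is an addition that is not part of the command above.

```python
import subprocess


def build_tunnel_command(node, local_port=9999, remote_port=9999):
    """Build the ssh argument list that forwards local_port to remote_port on the node."""
    forward = "%d:localhost:%d" % (local_port, remote_port)
    # -N (an assumption, not in the original command) keeps ssh from opening a shell.
    return ["ssh", "-N", "-L", forward, node]


def open_tunnel(node):
    """Start the tunnel as a child process; call .terminate() on the result to close it."""
    return subprocess.Popen(build_tunnel_command(node))
```

Calling `open_tunnel` with the public IP or name of the node keeps the tunnel open until the returned process is terminated.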
On this node, run the Python proxy:
    python spark-ui-proxy.py localhost:8080 9999
You can stop the proxy at any time by hitting Ctrl+C.
Alternatively, you may run the proxy in the background:
    nohup python spark-ui-proxy.py localhost:8080 9999 &
You can also run it with Docker:
    docker build -t spark-ui-proxy .
    docker run -d --net host spark-ui-proxy localhost:8080 9999
Now, on your computer, open http://localhost:9999 and you should see the UI of your Spark cluster!
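If the page does not load, a quick check from your computer can tell you whether anything is answering on the tunnelled port. The sketch below assumes the default address used in this walkthrough; the function name `ui_is_reachable` is hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def ui_is_reachable(url="http://localhost:9999", timeout=5):
    """Return True if an HTTP server answers at url with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or an HTTP error status all count as unreachable.
        return False
```

A `False` result usually means that either the SSH tunnel or the proxy process is not running.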