Accessing the Apache Spark Web UI when the cluster runs on servers with closed ports
When you have an Apache Spark cluster running on a server where ports are closed, you cannot simply access the Spark master web UI at localhost:8080. The solution is to use SSH tunnels, which is pretty straightforward.
Note: You can check out my blog post on how to set up a Spark standalone cluster locally (the steps are pretty much the same when you are setting it up on a server) - How to set up a Apache Spark cluster in your local machine
Scenario 1:
The most basic scenario is when you have direct SSH access to the server where the Apache Spark master is running. Then all you have to do is run the following command in a terminal window on your local machine (the laptop or desktop you use) after you start the master on the server machine.
$ ssh -L 8080:localhost:8080 username@your.server.name
Once you have run this command you can access the Spark Web UI by simply going to "http://localhost:8080/" in your web browser. Likewise, you might want to create SSH tunnels for other ports that are needed when using the Spark Web UI, such as 4040.
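Since ssh accepts multiple -L options, you can forward several ports with a single command. As a sketch (assuming the application UI on port 4040 is served from the same server as the master):

$ ssh -L 8080:localhost:8080 -L 4040:localhost:4040 username@your.server.name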
Scenario 2:
If you do not have direct access to the server that is running the Apache Spark master, you can set up a multi-level SSH tunnel. This might be the case if you are running the cluster on a supercomputer where you only have access to the login node, and the Spark master runs on a compute node that you can only reach through the login node. In that case you can build the same SSH tunnel in two steps. First, after you start the Spark master on the compute node, run the following command from the login node to create an SSH tunnel for port 8080:
$ ssh -L 8080:localhost:8080 username@compute.node.name
Then run the following command from your local machine (the laptop or desktop you use):
$ ssh -L 8080:localhost:8080 username@your.server.name
Then you can access the Web UI just as before by going to "http://localhost:8080/".
Update: Based on Saliya's comment, you can do the two steps in a single command. An example of the command is as follows; you just need to run this from your local machine:
$ ssh -L 8080:compute.node.name:8080 username@your.server.name
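In this form, compute.node.name in the -L specification is resolved from your.server.name, so local port 8080 is forwarded through the login node directly to port 8080 on the compute node, and the Web UI is again available at "http://localhost:8080/".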