SSH Logins
Most of the InfolabServers are accessible only from within the Gates building network or from on campus. Remote login machines let you reach them from outside Gates or from off campus, but logging in twice is tedious and complicates tasks such as copying files. Here is how to configure ssh so that connecting feels direct regardless of your location.
If you're using a Mac or Linux machine, add the following lines to your ~/.ssh/config file:
## InfoLab Servers (http://snap.stanford.edu/moin/InfolabServers)
# remote login servers, directly accessible from outside world
Host whale.stanford.edu shark.stanford.edu snap.stanford.edu skate.stanford.edu ilc0-ext.stanford.edu il-fs-1.stanford.edu bruce.stanford.edu
    ProxyCommand none
# and the rest that should go through one of the remote login server
Host il*.stanford.edu madmax*.stanford.edu rulk.stanford.edu hulk.stanford.edu rocky.stanford.edu rambo.stanford.edu bruce.stanford.edu zarya.stanford.edu eel.stanford.edu snap.stanford.edu shark.stanford.edu skate.stanford.edu whale.stanford.edu silk.stanford.edu
    ProxyCommand ssh -q ${CS_PROXY_HOST:-whale}.stanford.edu exec nc %h %p
    ## Share connection across sessions to the same host as much as possible
    ControlMaster auto
    ControlPath ~/.ssh/master-%r@%h:%p
With this configuration, ssh connections to the hosts matched by the second Host line go through the remote login machine "whale" by default, so you don't have to worry about where you are connecting from. When needed, you can pick a different remote login server on the fly through the CS_PROXY_HOST environment variable:
CS_PROXY_HOST=shark ssh server ...
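The `${CS_PROXY_HOST:-whale}` part of the ProxyCommand is ordinary shell parameter expansion: it yields the value of CS_PROXY_HOST when that variable is set and non-empty, and falls back to whale otherwise. A minimal sketch that shows the expansion in isolation, assuming a POSIX-style shell; the function name proxy_host is made up for illustration and is not part of the config:

```shell
# Hypothetical helper (not part of the ssh config): prints the remote
# login host that the ProxyCommand expansion would select.
proxy_host() {
    printf '%s.stanford.edu\n' "${CS_PROXY_HOST:-whale}"
}

unset CS_PROXY_HOST
proxy_host                      # uses the default login host, whale
CS_PROXY_HOST=shark proxy_host  # override, as in the ssh command above
```

Because the fallback also applies when the variable is empty, clearing CS_PROXY_HOST is enough to return to whale.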
When you want to transfer a large amount of data directly to or from a server, you may want to bypass the proxy and session-sharing configuration. You can either temporarily comment out the config lines above or disable those options with command-line arguments:
ssh -o ProxyCommand=none -o ControlMaster=no server ...
rsync -e 'ssh -o ProxyCommand=none -o ControlMaster=no' ... server:
scp -o ProxyCommand=none -o ControlMaster=no ... server:
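If you bypass the proxy often, the options above can be wrapped in small shell functions. This is only a sketch: the names DIRECT_SSH_OPTS, ssh_direct, scp_direct and rsync_direct are hypothetical, and the unquoted variable relies on the word splitting of sh or bash (zsh does not split unquoted variables by default).

```shell
# Options from the commands above that disable the proxy and session sharing.
DIRECT_SSH_OPTS='-o ProxyCommand=none -o ControlMaster=no'

# Hypothetical wrappers; all arguments are passed through unchanged.
# $DIRECT_SSH_OPTS is left unquoted on purpose so it splits into words.
ssh_direct()   { ssh $DIRECT_SSH_OPTS "$@"; }
scp_direct()   { scp $DIRECT_SSH_OPTS "$@"; }
rsync_direct() { rsync -e "ssh $DIRECT_SSH_OPTS" "$@"; }
```

These could go in your shell startup file and be used like the plain commands, for example `ssh_direct server ...`. If a shared master connection to the host is already open, adding `-o ControlPath=none` should also keep ssh from reusing it.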
Hadoop Job/Task Tracker web UI
Access to the web interfaces of the InfolabClusterHadoop Job Tracker and Task Trackers is restricted to the Gates network. However, you can reach them from practically anywhere using SSH port forwarding: add a local port forwarding option when you log in to the node.
ssh iln29.stanford.edu -L50030:localhost:50030
Now you can access it at http://localhost:50030.
However, this doesn't solve the problem of accessing individual Task Tracker logs. For full access to them, enable SSH's dynamic forwarding when you log in to the head node:
ssh iln29.stanford.edu -D1080
Then set your browser's SOCKS proxy to localhost:1080.
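The same SOCKS proxy also works from the command line, which can be a quick way to check the tunnel before changing browser settings. A minimal sketch, assuming curl is installed locally; the function name via_socks and the SOCKS_PORT variable are hypothetical:

```shell
# Hypothetical helper: fetch a cluster-internal URL through the SOCKS proxy
# opened by `ssh -D1080`. SOCKS_PORT defaults to the port used above.
# --socks5-hostname makes curl resolve host names on the proxy side.
via_socks() {
    curl --silent --socks5-hostname "localhost:${SOCKS_PORT:-1080}" "$@"
}

# Example (requires the `ssh iln29.stanford.edu -D1080` session to be open,
# and assumes the Job Tracker answers on iln29's port 50030):
# via_socks http://iln29.stanford.edu:50030/
```

Resolving names on the proxy side matters here because cluster-internal host names may not resolve from outside the Gates network.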
References
- ssh_config(5) man page