= Infrastructure =

The InfolabComputeServersStats page lists our big-memory servers. These are first come, first served (nobody is managing the resources). If you are on a tight deadline or you feel that somebody is hogging a machine, talk to them or talk to your local sysadmin.

You can access these machines by logging in via ssh with your CSID. Here's an example:

{{{
~$ ssh tommy@madmax5
tommy@madmax5's password:
Last login: Sun Oct 4 10:54:11 2015 from whale.stanford.edu
tommy@madmax5:~$
}}}

Using Windows? Use [[http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html|PuTTY]] or [[https://www.cygwin.com/|Cygwin]].

==== Which server to pick? ====

The one that's feeling lonely (shows zero or little utilization). As a general rule of thumb, Chris' students should go for '''raidersX''' machines and Jure's students should go for '''madmaxX''' machines.

==== How do I set up passwordless/key-based ssh? ====

You '''do not'''. It's a limitation of our configuration: you need to supply your password on login. Of course, you can be resourceful and do:

{{{
~$ ssh tommy@madmax5
tommy@madmax5's password:
Last login: Tue Oct 6 00:52:52 2015 from whale.stanford.edu
tommy@madmax5:~$ ssh madmax3
Last login: Fri Oct 2 09:05:37 2015 from whale.stanford.edu
tommy@madmax3:~$
}}}

Did you notice that we never got asked for a password when logging into madmax3? Magic, huh? Or it could just be Kerberos.

==== But I hate typing in my password all the time! ====

No problem. Install Kerberos on your workstation. Here are some installation notes: KerberosMac | KerberosWindows.

==== Can I login from home? ====

Depends where home is. Logins over ssh should work from almost any network on the Stanford campus. If you're not on campus, you have three options.

'''1) whale'''

Log in via ssh to whale.stanford.edu first. Whale is on one of our networks, so you can ssh to other machines from there. Here's an example session.
{{{
~$ ssh tommy@whale.stanford.edu
tommy@whale.stanford.edu's password:
Last login: Sun Oct 4 10:53:42 2015 from c-76-111-212-54.hsd1.ca.comcast.net
tommy@whale:~$ ssh madmax3
Last login: Tue Oct 6 01:17:08 2015 from madmax5.stanford.edu
tommy@madmax3:~$ hostname
madmax3.stanford.edu
tommy@madmax3:~$ exit
logout
Connection to madmax3 closed.
tommy@whale:~$ exit
logout
Connection to whale.stanford.edu closed.
}}}

Hey... and this has the added benefit of only typing your password once. Yay!

'''2) Stanford VPN'''

Set up the [[https://itservices.stanford.edu/service/vpn|Stanford VPN]] connection, establish the connection, and you're done.

'''3) Infolab VPN'''

Check out the instructions on the [[VPN]] page. The benefit of this VPN service is that it looks like regular https traffic... so it should work from most hotels, airports, etc.

= Storage options =

|| '''Storage option''' || '''Mount point''' || '''Good for''' || '''Speed''' || '''Backed up?''' ||
|| Your home directory || /afs/cs/u/tommy || Stuff that matters, e.g. results, code || Not really || Daily ||
|| Local hard disk || /lfs/local/0 || Temporary files, intermediate results || Around 150 MB/s || No ||
|| Network storage || /dfs/scratchX || Datasets, things you need accessible across multiple servers || Up to 450 MB/s, but shared! || No ||

It is not common, but a server can have multiple local volumes (think of it as having multiple disks), so check whether there is an /lfs/local/1 if you're running out of space.

= Long running sessions =

So... you have some sweet Python code that takes two days to run. Madmax5 is feeling lonely, and you figure you'll just run your sweet.py there. Easy, right?

{{{
~$ ssh tommy@madmax5
tommy@madmax5:~$ python2.7 sweet.py
Starting sweet pie...
1 minutes...
2 minutes...
^Z
[1]+  Stopped                 python2.7 sweet.py
tommy@madmax5:~$ bg 1
[1]+ python2.7 sweet.py &
tommy@madmax5:~$ exit
Connection to madmax5 closed.
}}}

This should be all good when you log back in to madmax5, right? Not really... The backgrounded job still belongs to your login session, so it may get killed when you log out.
Even if the job survives, you can't really re-attach to it, so there's no easy way to see what it is up to. That's where [[https://www.gnu.org/software/screen/|screen]] & [[https://tmux.github.io/|tmux]] come in.

What?! Think of the two as virtual terminals... You know how you can have multiple tabs open in some applications? Think of screen & tmux as tabs for your ssh session. Here's a quick session:

{{{
~$ ssh tommy@madmax5

# Let's start a screen session (open a new tabbed thingie)
tommy@madmax5:~$ screen -S myScreen

# Run something in the first tab
tommy@madmax5:~$ uptime
 14:20:32 up 145 days, 8:48, 10 users, load average: 33.03, 33.07, 33.08

# Create a new tab by pressing Ctrl+A, C (C is for create... and also for cookie)
tommy@madmax5:~$ python2.7 sweet.py
Starting sweet pie...
1 minutes...
2 minutes...

# Switch between the tabs by pressing Ctrl+A, N (for next), Ctrl+A, P (for previous), or Ctrl+A, " (that brings up a list)

# Want to "minimize" screen and come back later? Press Ctrl+A, D (to detach)
}}}

Great, now we have tabs. What's so good about them? They stay open even after you log out. How do you get back to them?

{{{
~$ ssh tommy@madmax5

# Bring back the session we detached
tommy@madmax5:~$ screen -r myScreen
...
10 minutes...
11 minutes...
12 minutes...
13 minutes...
14 minutes...
15 minutes...
}}}

==== I lost my permissions when I re-attached !@#$@#$&^!$!$!! ====

It has to do with Kerberos... and since somebody else already wrote a guide on it, here's a link: ScreenKerberos. The same thing applies to tmux.

= Compute cluster =

You want to run sweet.py two hundred times, possibly with different parameters. You could make a script that would ssh into a bunch of machines, open some screens, and run some code. But luckily password-less ssh doesn't work, and controlling screen from a script is not that easy.
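How does one script do two hundred different things? A common pattern is to have the script map a scheduler-provided task index onto a parameter grid. Here's a minimal sketch of what that could look like inside sweet.py, assuming a Torque/PBS-style scheduler that exports a 1-based task index as the PBS_ARRAYID environment variable; the parameter grid and the pick_params helper are made up for illustration:

```python
# Hypothetical sketch: sweet.py picks its own parameters from the task index.
# Assumption: the scheduler exports a 1-based index as PBS_ARRAYID
# (Torque array jobs do this); the grid below is invented for the example.
import os


def pick_params(task_id, learning_rates, seeds):
    """Map a 1-based task index onto a (learning_rate, seed) grid."""
    i = task_id - 1                           # task indices start at 1
    lr = learning_rates[i % len(learning_rates)]
    seed = seeds[i // len(learning_rates)]
    return lr, seed


if __name__ == "__main__":
    task_id = int(os.environ.get("PBS_ARRAYID", "1"))
    lr, seed = pick_params(task_id, [0.1, 0.01, 0.001], list(range(1, 68)))
    print("task %d: lr=%s, seed=%s" % (task_id, lr, seed))
```

Every task then runs the exact same code but works on its own slice of the grid, so a single submission covers the whole sweep.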
Let's put together a submission script for the compute cluster:

{{{
#PBS -l nodes=1:ppn=1
#PBS -N SweetJob
python2.7 /dfs/scratch0/tommy/sweet.py
}}}

We saved that script to /dfs/scratch0/tommy/run_sweet.sh. Now we can run a job with 200 of these tasks on the cluster with a single command (make sure to log in to ilhead1 first):

{{{
~$ ssh tommy@ilhead1
tommy@ilhead1:~$ qsub -t 1-200 /dfs/scratch0/tommy/run_sweet.sh
}}}

You can check on your job from time to time with qstat:

{{{
qstat -a
}}}

You'll get a bunch of *.o* and *.e* files in the directory you submitted the job from. These contain the standard output and standard error of all the tasks.

{{{#!wiki note
'''Pro tip'''

Submit the job from /dfs/... and put all your code and dependencies there too.
}}}

Read more about the cluster here: InfolabClusterCompute

= Hadoop cluster =

What's all the hubbub? It's distributed: it spreads the data blocks between multiple nodes, and the resource manager/scheduler/whateveryouwanttocallit is aware of the data locations. Which means we can "grep" through a 50 TB dataset in about half an hour. Cool, right?

==== How do I get access? ====

You'll need a CSID and a home directory on the HDFS. You probably already have your CSID (if you don't, congrats for reading this far anyway). Your sysadmin can take care of the home directory (if you ask nicely).

==== Where do I? How do I? ====

These are the nodes that have the Hadoop packages installed:

{{{
madmax
madmax2
madmax3
madmax4
madmax5
}}}

Here's how you list the contents of your HDFS home directory:

{{{
hadoop fs -ls /user/tommy
}}}

Here's how you submit a job to the cluster:

{{{
hadoop jar ...
}}}

==== More Hadoop info ====

 * Examples and usage: InfolabClusterHadoop
 * Current cluster status: [[ilHadoopStatus]]
 * Current cluster statistics: [[ilHadoopStats]]
 * HDFS info: http://ilhadoop1.stanford.edu:50070/dfshealth.html
 * Application Tracker: http://ilhadoop1.stanford.edu:8088/cluster

= Q&A =

==== Who is Tommy? ====
Tommy is his name; placeholding is his game. His friends are Bill Oddie, Private Tentpeg, and Airman Snuffy. Tommy likes to secretly volunteer for this wiki. The agencies involved can neither confirm nor deny that Corporal Schmuckatelli is involved in this matter.