InfolabClusterComputeHowtoVariables

This HOWTO describes how to pass arguments to your program running on the Infolab Compute Cluster.

The program

We are going to use the script from InfolabClusterComputeHowtoSingle, modified so that it prints out the arguments passed to it. You can download the script here: SingleCoreVariables.py.

#!/usr/bin/python2.7

import socket, datetime, time, getpass, sys

start = datetime.datetime.now()
hostname = socket.gethostname().split('.')[0]
username = getpass.getuser()
time.sleep(10)
end = datetime.datetime.now()

dfmt = "%Y-%m-%d %H:%M:%S"
print "Started: %s Finished: %s Host: %s User: %s" % (start.strftime(dfmt), end.strftime(dfmt), hostname, username)
print "My arguments:"
print sys.argv

The script starts, records the current time, and figures out the hostname it is running on and the username it is running as. It then sleeps for 10 seconds (so we at least have some impact on the cluster), records the time again and prints out a string that may look something like this (if we called it with myarg1, myarg2 and myarg3, of course):

Started: 2012-10-16 18:19:47 Finished: 2012-10-16 18:19:57 Host: ilhead1 User: akrevl
My arguments:
['./SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']
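
For reference, output like the one above can be produced by running the script by hand on the submit node before involving the scheduler at all. This is just a quick local test; it assumes the script lives in ~/tutorial/SingleCoreVariables and relies on the #!/usr/bin/python2.7 shebang:

~/ $ cd tutorial/SingleCoreVariables
~/tutorial/SingleCoreVariables $ chmod +x SingleCoreVariables.py    # only needed if the script is not yet executable
~/tutorial/SingleCoreVariables $ ./SingleCoreVariables.py myarg1 myarg2 myarg3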

The submission script

Static parameters

In this example we show how to run our program with parameters when these parameters are static and can be hard-coded into the submission script. Perhaps not very useful, but here it goes anyway...

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py myarg1 myarg2 myarg3

Once our job completes, the output is:

~/ $ cat SingleCoreVariablesJob.o4656
Started: 2012-10-16 18:51:31 Finished: 2012-10-16 18:51:41 Host: iln28 User: akrevl
My arguments:
['/afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']

Using environment variables: -V

What if we would like to pass arguments along with the qsub command? We can try a script like this:

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py $1 $2 $3

However, if we try to run qsub with our three arguments, it will complain that we have supplied the wrong arguments (actually it just prints out the usage information, but we should take the hint).

~/ $ qsub -V $HOME/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh myarg1 myarg2 myarg3
usage: qsub [-a date_time] [-A account_string] [-b secs]
      [-c [ none | { enabled | periodic | shutdown |
      depth=<int> | dir=<path> | interval=<minutes>}... ]
...

Instead of passing the arguments on the command line, we can pass them as environment variables. Let's export our arguments first:

~/ $ export qsubarg1="myarg1"
~/ $ export qsubarg2="myarg2"
~/ $ export qsubarg3="myarg3"
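
As an optional sanity check (assuming a standard Linux shell), we can verify that the three variables really are in the environment qsub will capture:

~/ $ env | grep qsubarg    # should list qsubarg1, qsubarg2 and qsubarg3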

Now let's adjust the submission script so that it reads the qsubargX variables:

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py $qsubarg1 $qsubarg2 $qsubarg3

Let's try and submit this to the cluster (do not forget the -V switch, which tells qsub to export all environment variables from our current session, including qsubarg1 through qsubarg3, to the job):

qsub -V /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh
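
If exporting variables and handing the whole environment to the job with -V feels too heavy, qsub also accepts an explicit list of variables with the lowercase -v switch. Treat the line below as a sketch; the exact behaviour depends on the Torque/PBS version installed on the cluster. The submission script above can be reused unchanged, since it only reads $qsubarg1, $qsubarg2 and $qsubarg3:

# pass just the three variables we need, without exporting them first
qsub -v qsubarg1=myarg1,qsubarg2=myarg2,qsubarg3=myarg3 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh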

Now that we have the program up and running, let's log into the submission node ilhead1 and prepare a submission script. You can download the script here: SingleCore.qsub.sh

#!/bin/bash
#PBS -N SingleCoreJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCore/SingleCore.py

We are using a friendly name, SingleCoreJob, for our submission, and we are limiting our job to a single node and a single CPU core (based on what our script does, there really is no reason to ask for more). We are also limiting the wall clock time to 1 minute. Since our program only sleeps for 10 seconds, a 1-minute wall time seems more than enough for the job to complete.
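
For jobs that need more resources, the same directives simply take larger values. The sketch below is purely illustrative: the job name is made up, and any request must of course stay within the limits of the queue you submit to.

#!/bin/bash
#PBS -N BiggerJob                 # hypothetical job name
#PBS -l nodes=2:ppn=4             # 2 nodes with 4 cores each
#PBS -l walltime=01:30:00         # 1 hour 30 minutes of wall clock time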

Submit the job

Nothing left to do but submit the job to the cluster with qsub:

qsub -V /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCore/SingleCore.qsub.sh

If we submitted the job successfully, the resource manager should reply with the ID of the job and the name of the headnode:

4651.ilhead1.stanford.edu

Check on the job

While the job is running, you can check on it with the qstat and showq commands. Please be patient with showq, as it tends to time out when a lot of jobs are in the queue.

~/ $ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
4651.ilhead1               SingleCoreJob    akrevl                 0 R test

~/ $ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

4651                 akrevl    Running     1    00:01:00  Tue Oct 16 17:19:29

     1 Active Job        1 of  896 Processors Active (0.11%)
                         1 of   28 Nodes Active      (3.57%)
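
If you need more detail on a specific job, or want to remove it from the queue, the job ID returned at submission time can be used with the standard Torque commands. A short sketch, using the ID from the example above:

~/ $ qstat -f 4651.ilhead1.stanford.edu    # full details for this job
~/ $ qdel 4651.ilhead1.stanford.edu        # cancel the job if it is no longer needed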

The results

Once the job is finished, it should deposit two files into the directory we ran qsub from:

  • SingleCoreJob.e4651: copy of the standard error stream

  • SingleCoreJob.o4651: copy of the standard output stream
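
By default these files are named after the job name (#PBS -N) plus the numeric job ID and are written to the directory the job was submitted from. If you prefer other locations, the standard PBS directives -o and -e point the output and error streams at explicit paths, and -j merges them. As a minimal, optional addition to the submission script:

#PBS -j oe    # merge the error stream into the output stream (one .o file, no separate .e file)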

Let's see what our directory contains:

~/ $ ls /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCore
SingleCoreJob.e4651 
SingleCoreJob.o4651
SingleCore.py
SingleCore.qsub.sh

Now let's look at the contents of those files:

~/ $ cat SingleCoreJob.e4651
~/ $ cat SingleCoreJob.o4651
Started: 2012-10-16 17:19:29 Finished: 2012-10-16 17:19:39 Host: iln28 User: akrevl

Excellent: the standard error file is empty, and the standard output tells us that our job ran on node iln28 and finished (as expected) in 10 seconds.