InfolabClusterComputeHowtoVariables

This HOWTO describes how to pass arguments to your program running on the Infolab Compute Cluster. We assume that you are already familiar with the basic principles of submitting jobs to the cluster. If that is not the case, please see this tutorial first: InfolabClusterComputeHowtoSingle.

The program

We are going to use the script from InfolabClusterComputeHowtoSingle, modified so that it also prints out the arguments passed to it. You can download the script here: SingleCoreVariables.py.

#!/usr/bin/python2.7

import socket, datetime, time, getpass, sys

start = datetime.datetime.now()
hostname = socket.gethostname().split('.')[0]
username = getpass.getuser()
time.sleep(10)
end = datetime.datetime.now()

dfmt = "%Y-%m-%d %H:%M:%S"
print "Started: %s Finished: %s Host: %s User: %s" % (start.strftime(dfmt), end.strftime(dfmt), hostname, username)
print "My arguments:"
print sys.argv

The script starts, records the current time, figures out the hostname it is running on and the username it is running as. Then it sleeps for 10 seconds (so we at least have some impact on the cluster), records the time again and prints out a string that may look a little something like this (if we called it with myarg1, myarg2 and myarg3, of course):

Started: 2012-10-16 18:19:47 Finished: 2012-10-16 18:19:57 Host: ilhead1 User: akrevl
My arguments:
['./SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']
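
For reference, output like the listing above is what you get if you simply run the script by hand with three arguments before involving the cluster. A quick sanity check could look like this (the path is just our example setup):

# run the script directly with three test arguments
cd /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables
/usr/bin/python2.7 ./SingleCoreVariables.py myarg1 myarg2 myarg3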

The submission script

Static parameters

In this example we show how to run our program when the parameters are static and can be hard-coded into the submission script. Perhaps not very useful, but here it goes anyway...

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py myarg1 myarg2 myarg3
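
Submitting this script works exactly as in the single-core HOWTO. Assuming you saved it as SingleCoreVariables.qsub.sh in our example directory, the submission could look like this (the output files will end up in the directory you run qsub from):

# submit the job with the hard-coded arguments
qsub /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh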

Once our job completes, the output is:

~/ $ cat SingleCoreVariablesJob.o4656
Started: 2012-10-16 18:51:31 Finished: 2012-10-16 18:51:41 Host: iln28 User: akrevl
My arguments:
['/afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']

-V: using environment variables

What if we would like to pass arguments along with the qsub command? We can try a script like this:

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py $1 $2 $3

However, if we try to run qsub with our three arguments, it will complain that we have supplied the wrong arguments (actually it just prints out the usage information, but we should take the hint).

~/ $ qsub -V $HOME/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh myarg1 myarg2 myarg3
usage: qsub [-a date_time] [-A account_string] [-b secs]
      [-c [ none | { enabled | periodic | shutdown |
      depth=<int> | dir=<path> | interval=<minutes>}... ]
...

Instead of passing the arguments on the command line, we can pass them as environment variables. Let's export our arguments first:

~/ $ export qsubarg1="myarg1"
~/ $ export qsubarg2="myarg2"
~/ $ export qsubarg3="myarg3"

Now let's adjust the submission script to use the qsubargX variables (you can download the script here: SingleCoreVariables.qsub.sh):

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py $qsubarg1 $qsubarg2 $qsubarg3

Let's submit this to the cluster (do not forget the -V switch, and run qsub from the same shell in which you exported the variables):

qsub -V /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh

qsub made sure that all the environment variables were passed to the execution node and our program ran with the provided arguments.

~/ $ cat SingleCoreVariablesJob.o4657
Started: 2012-10-16 19:21:13 Finished: 2012-10-16 19:21:23 Host: iln28 User: akrevl
My arguments:
['/afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']
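
If you want to double-check which of your variables actually reached the execution node, a small throwaway submission script like the sketch below (EnvCheckJob is just a made-up name, not one of the tutorial scripts) will list them; submit it with qsub -V and look at its standard output file:

#!/bin/bash
#PBS -N EnvCheckJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

# print every environment variable whose name starts with "qsubarg"
env | grep '^qsubarg'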

-v: listing the variables

Another way of passing the arguments to qsub is to list them directly on the qsub command line as key=value pairs. We keep the submission script from the previous example (you can download the script here: SingleCoreVariables.qsub.sh):

#!/bin/bash
#PBS -N SingleCoreVariablesJob
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00

/usr/bin/python2.7 /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py $qsubarg1 $qsubarg2 $qsubarg3

We can list the argument values as part of the qsub command using the -v switch (we could actually omit the -V switch, but we just got used to it, so why bother):

qsub -V -v qsubarg1="myarg1",qsubarg2="myarg2",qsubarg3="myarg3" /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh

qsub made sure that the specified arguments were available on the execution node and our program used them, as you can see in the following listing:

~/ $ cat SingleCoreVariablesJob.o4658
Started: 2012-10-16 19:28:55 Finished: 2012-10-16 19:29:05 Host: iln28 User: akrevl
My arguments:
['/afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.py', 'myarg1', 'myarg2', 'myarg3']
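
Once the arguments come in through -v, it is easy to submit the same program several times with different parameters. Here is a sketch of such a loop, reusing the paths and variable names from this HOWTO (valueA, valueB and valueC are placeholders for your own parameter values):

#!/bin/bash
# submit one job per parameter value; qsubarg1 varies, the other two stay fixed
for val in valueA valueB valueC; do
    qsub -v qsubarg1="$val",qsubarg2="myarg2",qsubarg3="myarg3" \
        /afs/cs.stanford.edu/u/akrevl/tutorial/SingleCoreVariables/SingleCoreVariables.qsub.sh
done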