The submission script

We'll tackle this one the other way around and create our submission script first. You can download the script here: JobArray.qsub.sh

    #!/bin/bash
    #PBS -N JobArray
    #PBS -l nodes=1:ppn=1
    #PBS -l walltime=00:01:00

    /usr/bin/python2.7 $HOME/tutorial/JobArray/JobArray.py $PBS_ARRAYID

The only special thing here is that we pass the array id (the number of this job instance within the array, available in $PBS_ARRAYID) to our Python script.
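
As a minimal sketch of the receiving end, a script could take the id from its first argument and, if none is given, fall back to the PBS_ARRAYID environment variable used in the submission script. The helper name get_array_id is made up for this illustration and is not part of JobArray.py.

```python
import os
import sys


def get_array_id(argv=None, environ=None):
    """Return the array id as an int (hypothetical helper, not in JobArray.py).

    The first command-line argument wins; otherwise the PBS_ARRAYID
    environment variable is used.
    """
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    if len(argv) > 1:
        return int(argv[1])
    if "PBS_ARRAYID" in environ:
        return int(environ["PBS_ARRAYID"])
    raise SystemExit("no array id: pass one as an argument or run as a job array")
```

Calling get_array_id() with no parameters reads the real command line and environment; the two parameters exist only so the function can be tried out without submitting a job.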

The program

Again, we are using the same simple Python script that sleeps for a while, then prints its start and finish times and its arguments. You can download the script here: JobArray.py

    #!/usr/bin/python2.7

    import socket, datetime, time, getpass, sys

    # The array id arrives as the first command-line argument.
    arrayid = sys.argv[1]

    # We're just using a simple list here, but you can easily
    # imagine this being read from a file or something similar.
    arguments = [
      [ "myarg1-0", "myarg2-0", "myarg3-0" ],
      [ "myarg1-1", "myarg2-1", "myarg3-1" ],
      [ "myarg1-2", "myarg2-2", "myarg3-2" ],
      [ "myarg1-3", "myarg2-3", "myarg3-3" ]
    ]

    start = datetime.datetime.now()
    hostname = socket.gethostname().split('.')[0]
    username = getpass.getuser()
    time.sleep(10)
    end = datetime.datetime.now()

    dfmt = "%Y-%m-%d %H:%M:%S"
    print "Started: %s Finished: %s Host: %s User: %s" % (start.strftime(dfmt), end.strftime(dfmt), hostname, username)
    print "My arguments:"
    # Use the array id as an index into the list of argument sets.
    print arguments[int(arrayid)]

The only twist is that we are reading the actual arguments from the list provided in the script itself. This could easily be replaced by reading from a CSV file or some other, neater argument storage.
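
A rough sketch of the CSV variant, assuming one argument set per row and Python 3 (under python2.7 the file would be opened in 'rb' mode instead of with newline=""). The function names and the file layout are assumptions made for this illustration.

```python
import csv


def load_arguments(path):
    """Read one argument set per CSV row and return them as a list of lists."""
    with open(path, newline="") as handle:
        # Skip completely empty lines so a trailing newline does no harm.
        return [row for row in csv.reader(handle) if row]


def arguments_for(path, arrayid):
    """Pick the row whose 0-based position matches the array id."""
    return load_arguments(path)[int(arrayid)]
```

With a file whose rows are "myarg1-0,myarg2-0,myarg3-0" and so on, arguments_for(path, arrayid) would take the place of arguments[int(arrayid)] in the script.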

Submit the job

Nothing left to do but submit the job to the cluster with qsub:

qsub -V -t 0-3 $HOME/tutorial/JobArray/JobArray.qsub.sh

There are a few things to note about the -t argument. It specifies that the job should be run as a job array, and it also specifies the array ids that our instances will get. When we run the command above we'll get instances 0, 1, 2 and 3. We could also specify those as a comma-delimited list. The following command does the same thing as the previous one:

qsub -V -t 0,1,2,3 $HOME/tutorial/JobArray/JobArray.qsub.sh
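
To make the equivalence of the two forms concrete, here is a small sketch that expands such an id list. It is an illustration only, not the scheduler's own parser; the function name expand_array_spec is made up, and it handles just plain ids and inclusive ranges.

```python
def expand_array_spec(spec):
    """Expand a -t style id list such as "0-3" or "0,1,2,3" into a list of ints."""
    ids = []
    for part in spec.split(","):
        if "-" in part:
            # A range like "0-3" includes both of its end points.
            first, last = part.split("-")
            ids.extend(range(int(first), int(last) + 1))
        else:
            ids.append(int(part))
    return ids


print(expand_array_spec("0-3"))      # [0, 1, 2, 3]
print(expand_array_spec("0,1,2,3"))  # [0, 1, 2, 3]
```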

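We could also make up our own non-sequential ids, although JobArray.py as written uses the id as a list index and would then fail with an IndexError: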

qsub -V -t 111,211,311,411 $HOME/tutorial/JobArray/JobArray.qsub.sh
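
A minimal sketch of a lookup that copes with such ids, assuming a dictionary keyed by array id. The name arguments_by_id and the pairing of ids with argument sets are made up for this illustration.

```python
# Hypothetical mapping from array id to argument set (not in JobArray.py).
arguments_by_id = {
    111: ["myarg1-0", "myarg2-0", "myarg3-0"],
    211: ["myarg1-1", "myarg2-1", "myarg3-1"],
    311: ["myarg1-2", "myarg2-2", "myarg3-2"],
    411: ["myarg1-3", "myarg2-3", "myarg3-3"],
}


def arguments_for_id(arrayid):
    """Return the argument set for an array id, or exit with a clear message."""
    try:
        return arguments_by_id[int(arrayid)]
    except KeyError:
        raise SystemExit("no arguments defined for array id %s" % arrayid)
```

Exiting with a message instead of a bare traceback makes it easier to spot in the job's error file that an instance was started with an id the script does not know about.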

Anyhow, if our jobs ran successfully, we should be able to see the results in the output files. In our case:

~/ $ cat *.o*
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-0', 'myarg2-0', 'myarg3-0']
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-1', 'myarg2-1', 'myarg3-1']
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-2', 'myarg2-2', 'myarg3-2']
Started: 2012-10-16 21:01:13 Finished: 2012-10-16 21:01:23 Host: iln28 User: akrevl
My arguments:
['myarg1-3', 'myarg2-3', 'myarg3-3']

So we successfully ran four instances of our script with four different sets of arguments. Of course this is only one way of doing things... but it seems to work...