Here is a hypothetical: you have one program that is neither multi-threaded nor aware of multiple cores. You have to run that program about a thousand times with different input parameters and different input data. Luckily, the result of each run is independent of all the others. This HOWTO describes how one might run such a scenario on the [[InfolabClusterCompute|Infolab Compute Cluster]].

We presume that you know your ''qsub'' basics. If that is not the case, please see [[InfolabClusterComputeHowtoSingle]] and [[InfolabClusterComputeHowtoVariables]] first.

= The submission script =

We'll tackle this one the other way around, so let's create our submission script first. You can download the script here: [[attachment:JobArray.qsub.sh]]

{{{#!highlight bash
#!/bin/bash
#PBS -N JobArray
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:01:00
/usr/bin/python2.7 $HOME/tutorial/JobArray/JobArray.py $PBS_ARRAYID
}}}

The only special thing here is that we pass the array id (the number of the job within the array) to our Python script.

= The program =

Again we are using the same simple Python script that sleeps for a while and then prints its start and end times and the arguments that it was called with. You can download the script here: [[attachment:JobArray.py]]

{{{#!highlight python
#!/usr/bin/python2.7
import socket, datetime, time, getpass, sys

# The array id is passed in by the submission script ($PBS_ARRAYID)
arrayid = sys.argv[1]

# We're just using a simple list here, but you can
# easily imagine this being read from a file or something similar
arguments = [
    [ "myarg1-0", "myarg2-0", "myarg3-0" ],
    [ "myarg1-1", "myarg2-1", "myarg3-1" ],
    [ "myarg1-2", "myarg2-2", "myarg3-2" ],
    [ "myarg1-3", "myarg2-3", "myarg3-3" ]
]

start = datetime.datetime.now()
hostname = socket.gethostname().split('.')[0]
username = getpass.getuser()
time.sleep(10)
end = datetime.datetime.now()

dfmt = "%Y-%m-%d %H:%M:%S"
print "Started: %s Finished: %s Host: %s User: %s" % (start.strftime(dfmt), end.strftime(dfmt), hostname, username)
print "My arguments:"
# The array id selects one entry of the list above
print arguments[int(arrayid)]
}}}

The only twist is that we read the actual arguments from the list defined in the script itself. This could easily be replaced by reading from a CSV file or some other, neater argument storage.

= Submit the job =

Nothing left to do but submit the job to the cluster with ''qsub'':

{{{
qsub -V -t 0-3 $HOME/tutorial/JobArray/JobArray.qsub.sh
}}}

There are a few things to note about the ''-t'' argument. It specifies that the job should be run as a job array, and it also specifies the array ids that our instances will get. When we run the command above we get instances 0, 1, 2 and 3. We could also specify those as a comma-delimited list. The following command does the same thing as the previous one:

{{{
qsub -V -t 0,1,2,3 $HOME/tutorial/JobArray/JobArray.qsub.sh
}}}

We could also make up our own non-sequential ids:

{{{
qsub -V -t 111,211,311,411 $HOME/tutorial/JobArray/JobArray.qsub.sh
}}}

Note that the example script uses the array id as an index into its four-element list, so with ids like these it would need a different lookup, for example a dictionary keyed by the array id.

Anyhow, if our jobs ran successfully, we should be able to see the results in the output files.
In our case:

{{{
~/ $ cat *.o*
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-0', 'myarg2-0', 'myarg3-0']
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-1', 'myarg2-1', 'myarg3-1']
Started: 2012-10-16 21:01:12 Finished: 2012-10-16 21:01:22 Host: iln28 User: akrevl
My arguments:
['myarg1-2', 'myarg2-2', 'myarg3-2']
Started: 2012-10-16 21:01:13 Finished: 2012-10-16 21:01:23 Host: iln28 User: akrevl
My arguments:
['myarg1-3', 'myarg2-3', 'myarg3-3']
}}}

So we successfully ran four instances of our script with 4 different sets of arguments. Of course this is only one way of doing things... but it seems to work...
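The idea mentioned in ''The program'', reading the arguments from a file instead of a list inside the script, could look roughly like the sketch below. It is only an illustration and not part of the attached scripts: the file layout (array id in the first column, arguments in the remaining columns), the function names `load_arguments` and `arguments_for`, and the script name `JobArrayCsv.py` are all assumptions made up for this example. Because the lookup is keyed by id rather than by list position, non-sequential ids such as 111,211,311,411 would work as well.

```python
from __future__ import print_function

import csv
import sys


def load_arguments(csv_path):
    """Map each array id (first column) to the remaining columns of its row."""
    table = {}
    with open(csv_path) as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            table[int(row[0])] = row[1:]
    return table


def arguments_for(arrayid, table):
    """Return the argument list for one array id, failing with a clear message."""
    key = int(arrayid)
    if key not in table:
        raise KeyError("no arguments defined for array id %d" % key)
    return table[key]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical call: python JobArrayCsv.py <arrayid> <arguments.csv>
    print("My arguments:")
    print(arguments_for(sys.argv[1], load_arguments(sys.argv[2])))
```

In the submission script, the last line would then pass both `$PBS_ARRAYID` and the path of the CSV file to the Python script. An id without a matching row fails with an explicit error, so a mistyped ''-t'' range shows up in the job's error file.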