Sending computing jobs

To submit a job, describe it in a .sub file, which defines parameters such as:

  • executable and arguments
  • data input file (inpfile)
  • output file (outfile)
  • file for standard output (output)
  • file for standard error (error)
  • number of parallel jobs (Queue)

Once the .sub file is ready, you can submit it with the condor_submit command, e.g.:

condor_submit zadanie.sub
Below is an example .sub file with short descriptions.

Important

You should always set should_transfer_files to FALSE. Computing nodes have no internal storage, so all input and output data should be transferred via the shared filesystem /sfs/users.

Example

job.sub:

Universe   = vanilla 
# You should avoid sending data if the size of input
# or output files exceeds 100 MB
should_transfer_files = FALSE
index       = $INT(ProcId,%02d) 

# your working directory
initialdir  = /sfs/users/somebody/

inpfile     = input.dat
outfile     = output.root

executable  = start.sh
arguments   = $(inpfile) $(outfile)
output      = logs/stdout_$(index).log
error       = logs/stderr_$(index).log
log         = logs/condor_$(index).log

# Name for the job batch
JobBatchName    = "my_job_$(index)"

# Number of jobs in this batch
Queue 3
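
To make the effect of index = $INT(ProcId,%02d) concrete, the sketch below reproduces the same zero-padded formatting in plain shell and lists the log files the example batch would write. It assumes that ProcId runs from 0 to N-1 for Queue N; the function name expected_logs is hypothetical and not part of HTCondor.

```shell
#!/bin/sh
# Hypothetical helper (not an HTCondor tool): print the log file names
# that job.sub would use for a batch of N jobs.
# Assumption: ProcId runs from 0 to N-1 for "Queue N".
expected_logs() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        # %02d mirrors index = $INT(ProcId,%02d) in the submit file
        printf 'logs/stdout_%02d.log logs/stderr_%02d.log logs/condor_%02d.log\n' "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# Queue 3 in the example corresponds to:
expected_logs 3
```

Since all three files live under logs/ inside initialdir, it is probably worth checking that this directory exists before submitting, as it may not be created automatically.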

If you need to set up an environment before running a job, you can use a shell script wrapper. Here is an example for the case when ROOT has to be set up:

#!/bin/bash

# Here you can set up the required environment, e.g. ROOT v6.24
source /opt/root/v6.24.00/bin/thisroot.sh

# Here you start your program
/sfs/users/hulk/someapp "$@"
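
A wrapper like this passes its arguments on to the real program, so how they are forwarded matters. The following sketch is an illustration only, with hypothetical helper functions standing in for the wrapper and the program. It shows that a quoted "$@" keeps each argument intact, whereas an unquoted $@ splits arguments on whitespace.

```shell
#!/bin/sh
# Hypothetical stand-in for the real program: report how many
# arguments it received.
count_args() {
    echo "$#"
}

# Forward arguments the way a wrapper would, with and without quotes.
forward_quoted() {
    count_args "$@"
}

forward_unquoted() {
    # Deliberately unquoted to show the word splitting.
    count_args $@
}

forward_quoted "my input.dat" output.root     # prints 2
forward_unquoted "my input.dat" output.root   # prints 3
```

With file names that contain no spaces, such as input.dat and output.root, both forms behave the same; the quoted form is simply the safer default.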

Getting Job Status

$> condor_q

-- Submitter: froth.cs.wisc.edu : <128.105.73.44:33847> : froth.cs.wisc.edu
 ID      OWNER            SUBMITTED    CPU_USAGE ST PRI SIZE CMD
 125.0   jbasney         4/10 15:35   0+00:00:00 I  -10 1.2  hello.remote
 132.0   raman           4/11 16:57   0+00:00:00 R  0   1.4  hello

2 jobs; 1 idle, 1 running, 0 held
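
The ST column holds the job state: I for idle, R for running, H for held. As a rough sketch, the listing can be summarized with awk. The function name count_states is hypothetical, and the script assumes the column layout shown above, with ST as the sixth field; other condor_q output formats would need a different field number.

```shell
#!/bin/sh
# Hypothetical helper: count jobs per state from condor_q-style output
# read on standard input. Assumes the job ID (e.g. 125.0) is the first
# column and ST is the sixth, as in the listing above.
count_states() {
    awk '
        $1 ~ /^[0-9]+\.[0-9]+$/ { count[$6]++ }
        END {
            printf "idle=%d running=%d held=%d\n", count["I"], count["R"], count["H"]
        }
    '
}

# Example with the listing above as input:
count_states <<'EOF'
 ID      OWNER            SUBMITTED    CPU_USAGE ST PRI SIZE CMD
 125.0   jbasney         4/10 15:35   0+00:00:00 I  -10 1.2  hello.remote
 132.0   raman           4/11 16:57   0+00:00:00 R  0   1.4  hello
EOF
```

On a submit node the same function could be fed directly, e.g. condor_q | count_states, provided the output uses this layout.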

Removing a Job

To remove the job with ID 132.0, use the condor_rm command:

$> condor_rm 132.0
Job 132.0 removed.