Sending computing jobs
To submit a job, you first describe it in a .sub
file, which defines parameters such as:
- executable and arguments
- data input file (inpfile)
- output file (outfile)
- standard output file descriptor (output)
- standard error file descriptor (error)
- number of parallel jobs (Queue)
When the .sub
file is prepared, you can submit it using the condor_submit
command, e.g.:
condor_submit zadanie.sub
Below you will find an example .sub
file with a short description.
Important
You should always set should_transfer_files
to FALSE.
Computing nodes have no internal storage, so all input and output data should be transferred via the shared filesystem /sfs/users
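The staging step described above can be sketched as follows. The directory layout matches the example below (a working directory with a logs/ subdirectory and an input file); a temporary directory stands in for your area under /sfs/users so the sketch is runnable anywhere:

```shell
# Stand-in for /sfs/users/<user>; on the cluster you would use your real
# shared-filesystem directory instead of a temp dir.
WORKDIR="$(mktemp -d)"
mkdir -p "$WORKDIR/logs"              # HTCondor writes stdout/stderr/log files here
printf 'sample data\n' > "$WORKDIR/input.dat"
# Keep inputs well under the 100 MB guideline; du shows the size on disk
du -h "$WORKDIR/input.dat"
```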
Example
job.sub:
Universe = vanilla
# You should avoid transferring data if the size of the input
# or output files exceeds 100 MB
should_transfer_files = FALSE
index = $INT(ProcId,%02d)
# your working directory
initialdir = /sfs/users/somebody/
inpfile = input.dat
outfile = output.root
executable = start.sh
arguments = $(inpfile) $(outfile)
output = logs/stdout_$(index).log
error = logs/stderr_$(index).log
log = logs/condor_$(index).log
# Name for the job batch
JobBatchName = "my_job_$(index)"
# Number of jobs in this batch
Queue 3
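Queue 3 creates three jobs with ProcId values 0, 1 and 2, and $INT(ProcId,%02d) zero-pads each value, so the log names never collide. The same formatting can be reproduced with the shell's printf:

```shell
# ProcId takes the values 0, 1, 2 for "Queue 3"; %02d zero-pads them,
# yielding the per-job log file names from the example above.
for procid in 0 1 2; do
  printf 'logs/stdout_%02d.log\n' "$procid"
done
# → logs/stdout_00.log
# → logs/stdout_01.log
# → logs/stdout_02.log
```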
If you need to set up an environment before running a job, you can use a shell script wrapper. Example for the case when ROOT has to be set up:
#!/bin/bash
# Set up the required environment here, e.g. ROOT v6.24
source /opt/root/v6.24.00/bin/thisroot.sh
# Start your program, forwarding all arguments
/sfs/users/hulk/someapp "$@"
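Two details worth checking: the wrapper must be executable, and quoting "$@" preserves arguments that contain spaces. A runnable sketch, where a temp file and a printf line stand in for the real start.sh and application:

```shell
# Write a minimal stand-in wrapper to a temp file; on the cluster this
# would be your start.sh containing the thisroot.sh setup line.
WRAP="$(mktemp)"
cat > "$WRAP" <<'EOF'
#!/bin/bash
# "$@" forwards every argument unchanged to the real program
printf 'arg: %s\n' "$@"
EOF
chmod +x "$WRAP"                      # without this, condor cannot execute it
"$WRAP" input.dat output.root
# → arg: input.dat
# → arg: output.root
```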
Getting Job Status
$> condor_q
-- Submitter: froth.cs.wisc.edu : <128.105.73.44:33847> : froth.cs.wisc.edu
ID OWNER SUBMITTED CPU_USAGE ST PRI SIZE CMD
125.0 jbasney 4/10 15:35 0+00:00:00 I -10 1.2 hello.remote
132.0 raman 4/11 16:57 0+00:00:00 R 0 1.4 hello
2 jobs; 1 idle, 1 running, 0 held
The ST column shows the job state: I = idle, R = running, H = held.
Removing a job
To remove the job with ID 132.0, use the condor_rm
command:
$> condor_rm 132.0
Job 132.0 removed.
Passing only the cluster number (condor_rm 132) removes every job in that cluster.