JobManagement
In order to submit a job to DIRAC you need to describe your job requirements. There are several requirements you can specify, for instance the name of the executable program you want to run, the location of the output of this executable, etc.
To describe your job requirements you use a job description language (JDL) specific to DIRAC. On this page you will find examples of jobs and how to use DIRAC to submit them for execution, track their status and retrieve their results.
The following is the description of a simple job whose only purpose is to list the contents of the working directory. In your favorite text editor, create a file named Simple.jdl with the contents below:
    JobName = "Simple_Job";
    Executable = "/bin/ls";
    Arguments = "-ltr";
    StdOutput = "StdOut";
    StdError = "StdErr";
    OutputSandbox = {"StdOut","StdErr"};
This is a description (using JDL syntax) of a simple job. In it, you ask DIRAC to select an execution site (among the ones it can use for executing your jobs) and to execute the file /bin/ls with arguments -ltr. The results of this execution (i.e. the standard output and standard error of the job) will be recorded in the files StdOut and StdErr respectively.
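If you prefer to generate job descriptions from a script rather than typing them by hand, the JDL above can be produced programmatically. The following is a minimal Python sketch and not part of DIRAC: the helper name render_jdl is hypothetical, and it only handles the two kinds of values used in this example (quoted strings and brace-delimited lists of strings).

```python
def render_jdl(attributes):
    """Render a dict of job attributes as JDL text (hypothetical helper)."""
    lines = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            # Lists are written as {"a","b"}
            rendered = "{" + ",".join('"%s"' % item for item in value) + "}"
        else:
            # Everything else is written as a quoted string
            rendered = '"%s"' % value
        lines.append("%s = %s;" % (name, rendered))
    return "\n".join(lines)


simple_job = {
    "JobName": "Simple_Job",
    "Executable": "/bin/ls",
    "Arguments": "-ltr",
    "StdOutput": "StdOut",
    "StdError": "StdErr",
    "OutputSandbox": ["StdOut", "StdErr"],
}

print(render_jdl(simple_job))
```

Writing the returned text to a file gives the same six attributes as the hand-written Simple.jdl, one per line.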
To actually submit the job to DIRAC do:
    $ dirac-wms-job-submit Simple.jdl
    JobID = 123
Note that in this example we are assuming that the executable /bin/ls is already present on the machine where DIRAC will execute your job. This is indeed the case because we are using a system command which is present on all Unix machines.
If the submission is successful, DIRAC prints the identifier it has assigned to your job (123 in this example). You will use this identifier for retrieving the status of your job and the job results when its execution is finished. More details on this below.
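When submission is driven from a script, the identifier has to be extracted from the command output. Below is a small Python sketch, assuming the output contains a line of the form "JobID = 123" as in the example above; the function name parse_job_id is hypothetical and the exact output format may differ between DIRAC versions.

```python
import re


def parse_job_id(submit_output):
    """Return the job identifier found in the submit output, or None."""
    match = re.search(r"JobID\s*=\s*(\d+)", submit_output)
    return int(match.group(1)) if match else None


print(parse_job_id("JobID = 123"))  # prints 123
```

Returning None rather than raising lets the caller decide how to handle a failed submission.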
As with any other DIRAC command, you can use the option --help to get more information on the flags accepted by that command, for instance:
    $ dirac-wms-job-submit --help
    Submit jobs to DIRAC WMS
    Usage:
      dirac-wms-job-submit [option|cfgfile] ... JDL ...
    Arguments:
      JDL: Path to JDL file
    General options:
      -o: --option=   : Option=value to add
      -s: --section=  : Set base section for relative parsed options
      -c: --cert=     : Use server certificate to connect to Core Services
      -d --debug      : Set debug mode (-dd is extra debug)
      -h --help       : Shows this help
For an exhaustive list of the JDL syntax and examples of usage please refer to DIRAC's Job Description Language Reference.
Once your job is successfully submitted, DIRAC will queue it up for execution. Eventually, it will be sent to the execution site where it will be launched. To know the current status of your job, use the command:
    $ dirac-wms-job-status 123
    JobID=123 Status=Waiting; MinorStatus=Pilot Agent Submission; Site=ANY;
Here 123 is the job identifier. This number is displayed when you have successfully submitted your job with the command dirac-wms-job-submit. Status=Waiting means that your job is queued, waiting to run at any of the sites DIRAC is configured with (Site=ANY).
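The status line is easy to split into fields if you want to act on it from a script. The sketch below is an illustration in Python, assuming the line has exactly the "JobID=... Status=...; MinorStatus=...; Site=...;" shape shown above; parse_status_line is a hypothetical helper, not a DIRAC function.

```python
def parse_status_line(line):
    """Split a dirac-wms-job-status output line into a dict of fields."""
    fields = {}
    # The job identifier is separated from the other fields by the first space
    head, _, rest = line.strip().partition(" ")
    for chunk in [head] + rest.split(";"):
        key, sep, value = chunk.strip().partition("=")
        if sep:
            fields[key] = value.strip()
    return fields


status = parse_status_line(
    "JobID=123 Status=Waiting; MinorStatus=Pilot Agent Submission; Site=ANY;"
)
print(status["Status"], status["Site"])
```

All values are kept as strings, so the job identifier comes back as "123" rather than a number.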
Once your job has finished its execution, you can retrieve its output, that is, the set of files produced as a result of the execution of your job. For this purpose, use the command:
    $ dirac-wms-job-get-output 123
    Job output sandbox retrieved in /current/working/directory/123/
DIRAC will create a directory in your current working directory named after the job identifier (123 in this case) and put all the relevant files inside. You can also use the --Dir option of this command to tell DIRAC where you want your job's output to be stored. See dirac-wms-job-get-output --help for details.
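After retrieval it can be worth checking that every file listed in OutputSandbox actually arrived. The following Python sketch assumes the output was retrieved into a directory named after the job identifier, as described above; missing_outputs is a hypothetical helper name.

```python
import os


def missing_outputs(output_dir, expected_files):
    """Return the expected sandbox files that are absent from output_dir."""
    return [name for name in expected_files
            if not os.path.isfile(os.path.join(output_dir, name))]


# Directory "123" is the one created for job 123 in the example above
print(missing_outputs("123", ["StdOut", "StdErr"]))
```

An empty list means all expected files are present; a directory that does not exist simply reports every file as missing.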
In order to know when your job has finished, so that you can retrieve its results, you can use the command:
    $ dirac-wms-job-status 123
    JobID=123 Status=Done; MinorStatus=Execution Complete; Site=EGI.KEK.jp;
When this command displays Status=Done the job has completed its execution and its output is ready for retrieval.
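Checking the status by hand can be replaced by a simple polling loop. The sketch below, in Python, is one possible way to do it under stated assumptions: it calls dirac-wms-job-status through subprocess (which requires a configured DIRAC client), it looks only for the Status=Done text shown above, and the names query_status and wait_until_done as well as the interval and max_polls parameters are hypothetical.

```python
import re
import subprocess
import time


def query_status(job_id):
    """Run dirac-wms-job-status and return its raw output."""
    result = subprocess.run(
        ["dirac-wms-job-status", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_until_done(job_id, query=query_status, interval=60, max_polls=100,
                    sleep=time.sleep):
    """Poll until the output shows Status=Done; return False if polls run out."""
    for attempt in range(max_polls):
        # \b keeps "MinorStatus=Done;" from being mistaken for "Status=Done;"
        if re.search(r"\bStatus=Done;", query(job_id)):
            return True
        if attempt < max_polls - 1:
            sleep(interval)
    return False
```

The query and sleep arguments are injectable so the loop can be tried without a DIRAC installation. A job that never reaches Done is only detected here through the poll limit, so a real script would probably also want to stop on other final states.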
In most situations, you will want to execute your own programs, which are located on your computer. You can ask DIRAC to transfer those executable files together with your job to the execution site. You do this by specifying what is called a sandbox: a container of files you want to be transferred to or from the site where your job actually executes. The input sandbox is the set of files you want DIRAC to transfer to the execution site, and the output sandbox is the set of files you want to retrieve when your job has finished its execution (typically, those files are produced by your job).
This example shows how to submit a job that runs your own executable file (a shell script in this particular example).
- Create the job description in the file InputAndOuputSandbox.jdl with contents as shown below:
    JobName = "InputAndOuputSandbox";
    Executable = "testJob.sh";
    StdOutput = "StdOut";
    StdError = "StdErr";
    InputSandbox = {"testJob.sh"};
    OutputSandbox = {"StdOut","StdErr"};
Using the attribute InputSandbox we ask DIRAC to transfer the local file named testJob.sh to the site where this job will be launched for execution. testJob.sh is a file on your computer; it must exist for the job to be submitted.
- Create the file testJob.sh with contents:
    #!/bin/sh
    /bin/hostname
    /bin/date
    /bin/ls -la
It is strongly recommended that you test locally, on your own computer, that the executable you are going to submit to DIRAC actually works.
- Now you can submit this job with the command:
    $ dirac-wms-job-submit InputAndOuputSandbox.jdl
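The two precautions mentioned in the steps above (the files listed in InputSandbox must exist locally, and the executable should be tested before submission) can be automated. This is a minimal Python sketch under the assumption that the executable is a shell script runnable with /bin/sh; the helper names missing_input_files and runs_locally are hypothetical.

```python
import os
import subprocess


def missing_input_files(input_sandbox, base_dir="."):
    """Return the InputSandbox entries that do not exist under base_dir."""
    return [name for name in input_sandbox
            if not os.path.isfile(os.path.join(base_dir, name))]


def runs_locally(script_path):
    """Run a shell script with /bin/sh and report whether it exited with 0."""
    completed = subprocess.run(["/bin/sh", script_path],
                               capture_output=True, text=True)
    return completed.returncode == 0
```

For the job above you would call missing_input_files(["testJob.sh"]) and runs_locally("testJob.sh") from the directory holding the script, and submit only if the first returns an empty list and the second returns True.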
When data, programs, etc. are stored in a Grid Storage Element, they can be specified as part of InputSandbox or InputData. InputSandbox is declared as a comma-separated list, with each file name enclosed in double quotes.
Before a grid file can be used, it must first be uploaded to the Grid. This is done using the following command, where <LFN> is the logical file name the file will have on the Grid and <SE> is the destination Storage Element:
    dirac-dms-add-file <LFN> <local_file> <SE>
For example:
    bash-4.1$ dirac-dms-add-file /vo.france-asia.org/user/v/vhamar/TestPekin-1.txt Test-Pekin.txt CPPM-disk -o LogLevel=INFO
    Uploading /vo.france-asia.org/user/v/vhamar/TestPekin-1.txt
    putAndRegister: Checksum information not provided. Calculating adler32.
    putAndRegister: Checksum calculated to be e4be4920.
    __putFile: Executing transfer of file:Test-Pekin.txt to srm://marsedpm.in2p3.fr:8446/srm/managerv2?SFN= /dpm/in2p3.fr/home/vo.france-asia.org/user/v/vhamar/TestPekin-1.txt using 1 streams
    __putFile: Successfully put file to storage.
    putAndRegister: Sending accounting took 1.2 seconds
    Successfully uploaded file to CPPM-disk
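If uploads are scripted, it helps to validate the arguments before calling the command. The Python sketch below only builds the argument list in the order shown above (LFN, local file, Storage Element) and does not run anything; add_file_command is a hypothetical helper, and the check that the LFN starts with "/" is an assumption based on the example path.

```python
import os


def add_file_command(lfn, local_file, storage_element):
    """Build the argument list for dirac-dms-add-file (does not run it)."""
    if not lfn.startswith("/"):
        # Assumption: LFNs are absolute paths, as in the example above
        raise ValueError("LFN must start with '/': %r" % lfn)
    if not os.path.isfile(local_file):
        raise FileNotFoundError(local_file)
    return ["dirac-dms-add-file", lfn, local_file, storage_element]
```

The returned list can be passed to subprocess.run on a machine where the DIRAC client is installed and configured.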
- Use the same testJob.sh shell script as in the previous exercise.
- In the JDL we have to add OutputSE and OutputData:
    JobName = "LFNInputSandbox";
    Executable = "testJob.sh";
    StdOutput = "StdOut";
    StdError = "StdErr";
    InputSandbox = {"testJob.sh","LFN:/vo.france-asia.org/user/v/vhamar/test.txt"};
    OutputSandbox = {"StdOut","StdErr"};
    OutputSE = "CPPM-disk";
    OutputData = {"StdOut"};
- After creating the JDL file, the next step is to submit the job using the command:
    dirac-wms-job-submit <JDL>
The same effect can be achieved with the following JDL, LFNInputData.jdl:
    JobName = "LFNInputData";
    Executable = "testJob.sh";
    StdOutput = "StdOut";
    StdError = "StdErr";
    InputSandbox = {"testJob.sh"};
    InputData = {"LFN:/vo.france-asia.org/user/v/vhamar/test.txt"};
    OutputSandbox = {"StdOut","StdErr"};
    OutputSE = "CPPM-disk";
    OutputData = {"StdOut"};
An important difference between specifying input data as InputSandbox and as InputData is that in the first case the data file is always downloaded to the local disk of the job running in the Grid. In the InputData case, the file can be either downloaded locally or accessed remotely using some remote access protocol, e.g. rfio or dcap, depending on the policies adopted by your Virtual Organization.
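Moving from the InputSandbox form to the InputData form amounts to separating the local files from the grid files. A short Python sketch of that split, assuming grid files are recognised by the "LFN:" prefix used in the two JDL examples above (split_inputs is a hypothetical helper):

```python
def split_inputs(entries):
    """Separate local file names from grid files given as LFN:... entries."""
    local_files = [e for e in entries if not e.startswith("LFN:")]
    lfns = [e for e in entries if e.startswith("LFN:")]
    return local_files, lfns


sandbox, data = split_inputs(
    ["testJob.sh", "LFN:/vo.france-asia.org/user/v/vhamar/test.txt"]
)
print(sandbox)
print(data)
```

The first list corresponds to the InputSandbox attribute of LFNInputData.jdl and the second to its InputData attribute.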
- After creating the JDL file, the next step is to submit the job using the command:
    dirac-wms-job-submit <JDL>
For example:
    bash-3.2$ dirac-wms-job-submit Simple.jdl -o LogLevel=INFO
    2010-10-17 15:34:36 UTC dirac-wms-job-submit.py/DiracAPI INFO: <=====DIRAC v5r10-pre2=====>
    2010-10-17 15:34:36 UTC dirac-wms-job-submit.py/DiracAPI INFO: Will submit job to WMS
    JobID = 11
In the output of the command you get the DIRAC job ID which is a unique job identifier. You will use it later for other job operations.
- The next step is to monitor the job status using the command:
    dirac-wms-job-status <Job_ID>
For example:
    bash-3.2$ dirac-wms-job-status 11
    JobID=11 Status=Waiting; MinorStatus=Pilot Agent Submission; Site=ANY;
- Finally, once the job reaches the status Done, you can retrieve the job's Output Sandbox:
    dirac-wms-job-get-output [--Dir output_directory] <Job_ID>
- To continue with COMDIRAC commands, follow this link:
https://github.com/DIRACGrid/COMDIRAC/wiki/Job-Management