(#220) Added more documentation
henricasanova committed Aug 15, 2021
1 parent e75ff00 commit 769132a
Showing 1 changed file with 40 additions and 3 deletions.
43 changes: 40 additions & 3 deletions doc/service_guide_102/htcondor.md
@@ -13,7 +13,7 @@ by service-specific arguments passed to the `wrench::JobManager::submitJob()` me
Rather than going into a long-winded explanation, the example code fragments below
showcase the creation of a `wrench::HTCondorComputeService` instance
and its use by a WMS. Let's start with the creation (in main). Note that
arguments to service constructors are omitted for brevity (see the WMS
implementation in `examples/condor-grid-example/CondorWMS.cpp` for a
complete and working example).

@@ -54,18 +54,55 @@ htcondor_cs->addComputeService(vm2_cs);
So, at this point, HTCondor has access to 3 bare-metal compute services (2 of which are running inside VMs),
and 2 batch compute services.

Suppose that the WMS submits `wrench::StandardJob` instances to HTCondor. These jobs can be
of two kinds or, in HTCondor parlance, belong to one of two universes: **grid** jobs and **non-grid** jobs.
By default, a job is in the non-grid universe. If, however, the service-specific arguments
passed to `wrench::JobManager::submitJob()` include a "universe":"grid" key:value pair, then the submitted job
is in the grid universe. HTCondor handles the two kinds of jobs differently:

- Non-grid universe jobs are queued and dispatched by HTCondor whenever
possible to idle resources managed by one of the child bare-metal
services. HTCondor chooses the service to use.

- Grid universe jobs are dispatched by HTCondor immediately to
a specific child batch compute service. As a result, these jobs
must be submitted with service-specific arguments that provide values
for "-N", "-c", and "-t" keys (like for any job submitted to a batch
compute service), as well as a "-service" key that specifies the name
of the batch service that should run the job (this argument is optional
if there is a single child batch compute service).
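
The two rules above can be sketched as plain C++ over an argument map. This is only an illustration of the documented behavior, not WRENCH code: the `ServiceArgs` alias and the helpers `isGridUniverse()` and `missingGridKeys()` are hypothetical names introduced here, and the actual checks performed inside `wrench::HTCondorComputeService` may differ.

~~~~~~~~~~~~~{.cpp}
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical alias for the service-specific arguments map (not a WRENCH type)
using ServiceArgs = std::map<std::string, std::string>;

// Hypothetical helper: a job is in the grid universe only if the arguments
// contain the "universe":"grid" pair; otherwise it defaults to non-grid.
bool isGridUniverse(const ServiceArgs &args) {
    auto it = args.find("universe");
    return it != args.end() && it->second == "grid";
}

// Hypothetical helper: returns the keys that a grid universe job would still
// need. "-N", "-c", and "-t" are always needed; "-service" is needed unless
// there is exactly one child batch compute service.
std::vector<std::string> missingGridKeys(const ServiceArgs &args,
                                         std::size_t num_batch_services) {
    std::vector<std::string> missing;
    for (const char *key : {"-N", "-c", "-t"}) {
        if (args.count(key) == 0) {
            missing.push_back(key);
        }
    }
    if (num_batch_services != 1 && args.count("-service") == 0) {
        missing.push_back("-service");
    }
    return missing;
}
~~~~~~~~~~~~~

With the setup described earlier (2 child batch compute services), a grid universe job whose arguments lack "-service" would be flagged by `missingGridKeys()`, whereas the same arguments would pass if HTCondor had a single child batch compute service.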

In the example below, we show both kinds of job submissions:

~~~~~~~~~~~~~{.cpp}
// Create a non-grid universe standard job and submit it to HTCondor,
// which will run it on one of its 3 child bare-metal compute services
auto ng_job = job_manager->createStandardJob(...);
job_manager->submitJob(ng_job, htcondor_cs, {}); // no service-specific arguments

// Create a grid universe standard job and submit it to HTCondor,
// which will run it on a specific child batch compute service.
auto n_job = job_manager->createStandardJob(...);
std::map<std::string, std::string> service_specific_args;
service_specific_args["-N"] = "2"; // 2 compute nodes
service_specific_args["-c"] = "4"; // 4 cores per compute node
service_specific_args["-t"] = "60"; // runs for one hour
service_specific_args["universe"] = "grid"; // Grid universe
service_specific_args["-service"] = batch1_cs->getName(); // Run on the first batch compute service
job_manager->submitJob(n_job, htcondor_cs, service_specific_args); // with service-specific arguments
~~~~~~~~~~~~~
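
When many grid universe jobs are submitted, the argument map built by hand above could be assembled by a small helper. The following is a minimal sketch under stated assumptions: `makeGridArgs()` and the `ServiceArgs` alias are hypothetical names, not part of the WRENCH API, and the duration value is passed through unchanged, in whatever unit the "-t" key expects.

~~~~~~~~~~~~~{.cpp}
#include <map>
#include <string>

// Hypothetical alias for the service-specific arguments map (not a WRENCH type)
using ServiceArgs = std::map<std::string, std::string>;

// Hypothetical helper: builds the service-specific arguments for a grid
// universe job. An empty batch_service_name leaves "-service" out, which the
// text above allows only when there is a single child batch compute service.
ServiceArgs makeGridArgs(unsigned long num_nodes,
                         unsigned long cores_per_node,
                         unsigned long duration,
                         const std::string &batch_service_name = "") {
    ServiceArgs args;
    args["universe"] = "grid";                  // Grid universe
    args["-N"] = std::to_string(num_nodes);     // number of compute nodes
    args["-c"] = std::to_string(cores_per_node); // cores per compute node
    args["-t"] = std::to_string(duration);      // requested duration, as-is
    if (!batch_service_name.empty()) {
        args["-service"] = batch_service_name;  // target batch compute service
    }
    return args;
}
~~~~~~~~~~~~~

Under these assumptions, the grid universe submission shown earlier would become `job_manager->submitJob(n_job, htcondor_cs, makeGridArgs(2, 4, 60, batch1_cs->getName()));`.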

The above covers the essentials. See the code in the `examples/condor-grid-example/` directory
for a complete, working example.



# Anatomy of the HTCondor Service # {#guide-htcondor-anatomy}

The in-simulation implementation of HTCondor in WRENCH is simplified in
terms of its functionality and design when compared to the actual
implementation of HTCondor. The `wrench::HTCondorComputeService` spawns two
additional services during execution,
`wrench::HTCondorCentralManagerService` and