(#220) more documentation
henricasanova committed Aug 15, 2021
1 parent dc35bea commit e75ff00
Showing 2 changed files with 71 additions and 32 deletions.
25 changes: 4 additions & 21 deletions doc/service_guide_101/htcondor.md
@@ -11,8 +11,7 @@ HTCondor is composed of six main service daemons (`startd`, `starter`,
`schedd`, `shadow`, `negotiator`, and `collector`). In addition,
each host on which one or more of these daemons is spawned must also
run a `master` daemon, which controls the execution of all other
daemons (including initialization and completion).

# Creating an HTCondor Service # {#guide-htcondor-creating}

@@ -58,23 +57,7 @@ auto htcondor_compute_service = simulation->add(
));
~~~~~~~~~~~~~

Jobs submitted to the `wrench::HTCondorComputeService` instance will be dispatched to one
of the 'child' compute services available to that instance (only one in the above example).


# Anatomy of the HTCondor Service # {#guide-htcondor-anatomy}

In WRENCH, we implement the 3 fundamental HTCondor services, each realized
as a particular set of daemons. The _Job Execution Service_ consists of a
`startd` daemon, which adds the host on which it is running to the HTCondor
pool, and of a `starter` daemon, which manages task executions on this host.
The _Central Manager Service_ consists of a `collector` daemon, which collects
information about all other daemons, and of a `negotiator` daemon, which
performs task/resource matchmaking. The _Job Submission Service_ consists
of a `schedd` daemon, which maintains a queue of tasks, and of several
instances of a `shadow` daemon, each of which corresponds to a task submitted
to the Condor pool for execution.

![](images/htcondor-architecture.png)

Jobs submitted to the `wrench::HTCondorComputeService` instance will be
dispatched automatically to one of the 'child' compute services available
to that instance (only one in the above example).

78 changes: 67 additions & 11 deletions doc/service_guide_102/htcondor.md
@@ -1,18 +1,74 @@
Interacting with an HTCondor compute service {#guide-102-htcondor}
============

**This section of the documentation is a work in progress**

The WRENCH HTCondor service implementation spawns two additional services during
execution: `wrench::HTCondorCentralManagerService` and `wrench::HTCondorNegotiatorService`.
A `wrench::HTCondorComputeService` instance is essentially a front-end to several
"child" compute services. As such, one can submit jobs to it just as one would to any
other compute service, and it "decides" to which child service each job
is delegated. A WMS can even add new child compute services to HTCondor
dynamically. Which child service is used is dictated or influenced
by the service-specific arguments passed to the `wrench::JobManager::submitJob()` method.
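
To make the front-end idea concrete, here is a minimal, self-contained C++ sketch that deliberately does not use the WRENCH API. `ToyChildService`, `ToyFrontEnd`, and the `"target"` argument key are hypothetical names invented for this illustration; the service-specific arguments actually understood by `wrench::JobManager::submitJob()` are not modeled here. The sketch only shows the general pattern: children can be added at any time, and a per-job argument map influences which child receives the job.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a "child" compute service (not a WRENCH class).
struct ToyChildService {
    std::string name;
    std::vector<std::string> received_jobs;
};

// Hypothetical front-end that delegates every submitted job to one child.
class ToyFrontEnd {
public:
    // Children can be added at any time, including after jobs were submitted.
    void addChild(std::shared_ptr<ToyChildService> child) {
        children_.push_back(std::move(child));
    }

    // Delegates the job and returns the name of the child that received it.
    // If args holds the made-up key "target", the child with that name is
    // used; otherwise the first child is picked (an arbitrary toy policy).
    std::string submit(const std::string &job,
                       const std::map<std::string, std::string> &args = {}) {
        if (children_.empty()) {
            throw std::runtime_error("no child service available");
        }
        auto chosen = children_.front();
        auto it = args.find("target");
        if (it != args.end()) {
            chosen = nullptr;
            for (const auto &child : children_) {
                if (child->name == it->second) {
                    chosen = child;
                    break;
                }
            }
            if (!chosen) {
                throw std::invalid_argument("unknown child: " + it->second);
            }
        }
        chosen->received_jobs.push_back(job);
        return chosen->name;
    }

private:
    std::vector<std::shared_ptr<ToyChildService>> children_;
};
```

In this toy model, calling `submit("job1")` with no arguments sends the job to the first child, while `submit("job2", {{"target", "vm1"}})` sends it to the child named `vm1`, even if that child was added after the first submission.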

The `wrench::HTCondorCentralManagerService` coordinates the execution of jobs
submitted to the HTCondor pool. Jobs submitted to the `wrench::HTCondorComputeService`
are queued in a `std::vector<wrench::StandardJob *>` and are
consumed as resources become available. The Central Manager also spawns the
`wrench::HTCondorNegotiatorService`, which performs matchmaking
between jobs and the compute resources available in the pool. Note that job submission
in HTCondor is asynchronous, so our simulated services operate independently
of each other.
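
The queueing and matchmaking described above can be pictured with a small, self-contained C++ sketch. It is not WRENCH code: `ToyJob`, `ToyResource`, and `matchOnce` are hypothetical names, and the first-fit policy based on free cores is an assumption chosen for simplicity, not the policy implemented by `wrench::HTCondorNegotiatorService`.

```cpp
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical job and resource descriptions (not WRENCH classes).
struct ToyJob {
    std::string name;
    int cores;  // cores the job needs
};

struct ToyResource {
    std::string name;
    int free_cores;  // cores currently available
};

// One matchmaking round. Jobs are considered in queue order and each one is
// assigned to the first resource with enough free cores (first-fit, an
// assumption). Matched jobs leave the queue; the others stay pending until
// a later round. Returns (job name, resource name) pairs.
std::vector<std::pair<std::string, std::string>>
matchOnce(std::deque<ToyJob> &pending, std::vector<ToyResource> &pool) {
    std::vector<std::pair<std::string, std::string>> matches;
    std::deque<ToyJob> still_pending;
    for (const auto &job : pending) {
        bool placed = false;
        for (auto &res : pool) {
            if (res.free_cores >= job.cores) {
                res.free_cores -= job.cores;
                matches.emplace_back(job.name, res.name);
                placed = true;
                break;
            }
        }
        if (!placed) {
            still_pending.push_back(job);
        }
    }
    pending = std::move(still_pending);
    return matches;
}
```

Calling `matchOnce` again after some cores have been freed picks up the jobs that were left pending, which mirrors the idea that queued jobs are consumed as resources become available.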

Rather than giving a long-winded explanation, the example code fragments below
showcase the creation of a `wrench::HTCondorComputeService` instance
and its use by a WMS. Let's start with the creation (in main). Note that
arguments to service constructors are omitted for brevity (see the WMS
implementation in `examples/condor-grid-example/CondorWMS.cp` for a
complete and working example).


~~~~~~~~~~~~~{.cpp}
// Create a BareMetalComputeService instance
auto baremetal_cs = simulation->add(new wrench::BareMetalComputeService(...));
// Create two BatchComputeService instances
auto batch1_cs = simulation->add(new wrench::BatchComputeService(...));
auto batch2_cs = simulation->add(new wrench::BatchComputeService(...));
// Create an HTCondorComputeService instance with the above
// three services as "child" services
auto htcondor_cs = simulation->add(
    new wrench::HTCondorComputeService("some_host", {baremetal_cs, batch1_cs, batch2_cs}, "/scratch"));
// Create a CloudComputeService instance
auto cloud_cs = simulation->add(new wrench::CloudComputeService(...));
~~~~~~~~~~~~~

Now suppose that a WMS has been created with access to all 5 services above, but chooses to submit
all jobs via HTCondor. To make the cloud service usable through HTCondor, the first step
is to create a few VM instances and add them as child services to the HTCondor service:

~~~~~~~~~~~~~{.cpp}
// Create and start two VMs on the cloud service
auto vm1 = cloud_cs->createVM(...);
auto vm2 = cloud_cs->createVM(...);
auto vm1_cs = cloud_cs->startVM(vm1);
auto vm2_cs = cloud_cs->startVM(vm2);
// Add the two VMs' bare-metal compute services to HTCondor
htcondor_cs->addComputeService(vm1_cs);
htcondor_cs->addComputeService(vm2_cs);
~~~~~~~~~~~~~

So, at this point, HTCondor has access to 3 bare-metal compute services (2 of which are running inside VMs),
and 2 batch compute services.








# Anatomy of the HTCondor Service # {#guide-htcondor-anatomy}

The in-simulation implementation of HTCondor in WRENCH is simplified, in
both functionality and design, compared to the actual
implementation of HTCondor. The `wrench::HTCondorComputeService` spawns two
additional services during execution,
`wrench::HTCondorCentralManagerService` and
`wrench::HTCondorNegotiatorService`, which loosely correspond to
actual HTCondor daemons (`collector`, `negotiator`, `schedd`).
