(#157)
Added a data movement example
henricasanova committed Apr 21, 2020
1 parent 945c9f8 commit d0f4b4b
Showing 40 changed files with 514 additions and 61 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -368,7 +368,7 @@ set(TEST_FILES
test/pilot_job/CriticalPathSchedulerTest.cpp
test/misc/PointerUtilTest.cpp
test/misc/BogusMessageTest.cpp
examples/simple-example/scheduler/pilot_job/CriticalPathPilotJobScheduler.cpp
examples/real-workflow-example/scheduler/pilot_job/CriticalPathPilotJobScheduler.cpp
test/compute_services/ScratchSpaceTest.cpp
test/energy_consumption/EnergyConsumptionTest.cpp
test/simulation/simulation_output/SimulationDumpJSONTest.cpp
3 changes: 2 additions & 1 deletion conf/cmake/Examples.cmake
Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
# list of examples
set(EXAMPLES_CMAKEFILES_TXT
examples/simple-example/CMakeLists.txt
examples/basic-examples/bare-metal-chain/CMakeLists.txt
examples/basic-examples/bare-metal-chain-scratch/CMakeLists.txt
examples/basic-examples/bare-metal-bag-of-tasks/CMakeLists.txt
examples/basic-examples/bare-metal-complex-job/CMakeLists.txt
examples/basic-examples/bare-metal-data-movement/CMakeLists.txt
examples/basic-examples/cloud-bag-of-tasks/CMakeLists.txt
examples/basic-examples/virtualized-cluster-bag-of-tasks/CMakeLists.txt
examples/basic-examples/batch-bag-of-tasks/CMakeLists.txt
examples/basic-examples/batch-pilot-job/CMakeLists.txt
examples/real-workflow-example/CMakeLists.txt
)

foreach (cmakefile ${EXAMPLES_CMAKEFILES_TXT})
34 changes: 26 additions & 8 deletions examples/README.md
Original file line number Diff line number Diff line change
@@ -10,36 +10,42 @@ Below are high-level descriptions of the example in each sub-directory.
Details on the specifics of each simulator are in extensive source code
comments.

---

### Basic simulator examples

#### Simulators that showcase fundamental functionality (using the bare-metal compute service)

- ```bare-metal-chain```: A simulation of the execution of a
- ```basic-examples/bare-metal-chain```: A simulation of the execution of a
chain workflow by a Workflow Management System on a bare-metal compute service,
with all workflow data being read/written from/to a single storage
service. The compute service runs on a 10-core host, and each task is
executed as a single job that uses 10 cores.

- ```bare-metal-chain-scratch```: Similar, but the compute service now
- ```basic-examples/bare-metal-chain-scratch```: Similar, but the compute service now
has scratch space to hold intermediate workflow files. Files
created in the scratch space during a job's execution are erased after
that job's completion. As a result, the workflow is executed as a single
multi-task job.

- ```bare-metal-bag-of-tasks```: A simulation of the execution of a
- ```basic-examples/bare-metal-bag-of-tasks```: A simulation of the execution of a
bag-of-task workflow by a Workflow Management System on a bare-metal compute
service, with all workflow data being read/written from/to a single
storage service. Up to two workflow tasks are executed concurrently on
the compute service, in which case one task is executed on 6 cores and
the other on 4 cores.

- ```bare-metal-complex-job```: A simulation of the execution of a
- ```basic-examples/bare-metal-complex-job```: A simulation of the execution of a
one-task workflow on a compute service as a job that includes not only
the task computation but also data movements.

- ```basic-examples/bare-metal-data-movement```: A simulation that is similar
to the previous one, except that the WMS performs the data movements
and deletions "by hand" instead of using a complex job.

#### Simulators that showcase the use of the cloud compute service

- ```cloud-bag-of-tasks```: A simulation of the execution of a
- ```basic-examples/cloud-bag-of-tasks```: A simulation of the execution of a
bag-of-task workflow by a Workflow Management System on a cloud compute
service, with all workflow data being read/written from/to a single
storage service. Up to two workflow tasks are executed concurrently on
@@ -48,14 +54,26 @@ Details on the specifics of each simulator are in extensive source code

#### Simulators that showcase the use of the batch compute service

- ```batch-bag-of-tasks```: A simulation of the execution of a
- ```basic-examples/batch-bag-of-tasks```: A simulation of the execution of a
bag-of-task workflow by a Workflow Management System on a batch compute
service, with all workflow data being read/written from/to a single
storage service. Up to two workflow tasks are executed concurrently on
the compute service on two compute nodes, each of them using 10 cores.
**This example also features low-level workflow execution handling, and
dealing with job failures**.

- ```batch-pilot-jobs```: A simulation that showcases the use of
"pilot jobs" on a batch compute service.
- ```basic-examples/batch-pilot-job```: A simulation that showcases the use of
"pilot jobs" on a batch compute service. In this example, the workflow does not
complete successfully because a pilot job expires.

---

### Examples with real workflows and more sophisticated WMS implementations

- ```real-workflow-example```: Two simulators, one in which the workflow is executed
using a batch compute service, and another in which the workflow is executed
using a cloud compute service. These simulators take as input workflow description
files from real-world workflow applications. They use the scheduler abstraction
provided by WRENCH to implement a complex Workflow Management System.

---
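
The `bare-metal-chain-scratch` description above argues that, because scratch files are erased when a job completes, the chain has to run as one multi-task job. A minimal sketch of that reasoning follows. `ToyScratch` and `chainSucceeds` are hypothetical names used only for illustration; they are not part of the WRENCH API, and the per-job bookkeeping is an assumed simplification.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-in for a compute service's scratch space (hypothetical, not WRENCH).
// Files are tracked per job and erased when that job completes.
class ToyScratch {
public:
    void write(int job_id, const std::string &file) { files_[job_id].insert(file); }

    bool has(int job_id, const std::string &file) const {
        auto it = files_.find(job_id);
        return it != files_.end() && it->second.count(file) > 0;
    }

    // Completing a job erases every file it created in scratch.
    void completeJob(int job_id) { files_.erase(job_id); }

private:
    std::map<int, std::set<std::string>> files_;
};

// Two-task chain: task 1 writes "intermediate", task 2 reads it.
// Returns true if task 2 finds its input in scratch.
bool chainSucceeds(bool single_job) {
    ToyScratch scratch;
    const int job_of_task1 = 1;
    const int job_of_task2 = single_job ? 1 : 2;

    scratch.write(job_of_task1, "intermediate");
    if (!single_job) {
        // With one job per task, the first job ends before the second starts.
        scratch.completeJob(job_of_task1);
    }
    const bool found = scratch.has(job_of_task2, "intermediate");
    scratch.completeJob(job_of_task2);
    return found;
}
```

In this toy model, `chainSucceeds(true)` finds the intermediate file, while `chainSucceeds(false)` does not, which mirrors why the example submits the whole chain as a single job.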
Original file line number Diff line number Diff line change
@@ -86,7 +86,7 @@ int main(int argc, char **argv) {

/* Add workflow tasks and files */
for (int i=0; i < num_tasks; i++) {
/* Create a task: 10GFlop, 1 to 10 cores, 0.90 parallel efficiency, 10MB memory footprint */
/* Create a task: random GFlop, 1 to 10 cores, 0.90 parallel efficiency, 10MB memory footprint */
auto task = workflow.addTask("task_" + std::to_string(i), dist(rng), 1, 10, 0.90, 10000000);
task->addInputFile(workflow.addFile("input_" + std::to_string(i), 10000000));
task->addOutputFile(workflow.addFile("output_" + std::to_string(i), 10000000));
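
The hunk above passes `dist(rng)` as the task's flop count, but the definitions of `dist` and `rng` fall outside the hunk. The sketch below shows one way such values could be drawn, assuming a uniform distribution seeded with a fixed value; the helper names, bounds, and seed are hypothetical. It also estimates a task's runtime under an assumed constant-efficiency model, in which n cores give a speedup of n times the parallel efficiency. That reading of the 0.90 argument is an assumption, not something the diff confirms.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: draw one flop count per task from a uniform range.
// A fixed seed makes the sequence reproducible across runs on the same platform.
std::vector<double> drawTaskFlops(int num_tasks, double min_flops, double max_flops, unsigned seed) {
    assert(num_tasks >= 0 && min_flops < max_flops);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> dist(min_flops, max_flops);
    std::vector<double> flops;
    flops.reserve(static_cast<std::size_t>(num_tasks));
    for (int i = 0; i < num_tasks; i++) {
        flops.push_back(dist(rng));
    }
    return flops;
}

// Assumed model (not taken from the diff): n cores at the given efficiency
// behave like n * efficiency perfectly used cores.
double estimateRuntime(double flops, double core_speed_flops_per_sec, int num_cores, double efficiency) {
    assert(core_speed_flops_per_sec > 0 && num_cores > 0 && efficiency > 0);
    return flops / (core_speed_flops_per_sec * num_cores * efficiency);
}
```

With a hypothetical core speed of 1 Gflop/s, a 10 Gflop task on 10 cores at 0.90 efficiency would take about 1.11 seconds under this model.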
Original file line number Diff line number Diff line change
@@ -8,8 +8,8 @@
*/


#ifndef WRENCH_TWO_TASKS_AT_A_TIME_H
#define WRENCH_TWO_TASKS_AT_A_TIME_H
#ifndef WRENCH_EXAMPLE_TWO_TASKS_AT_A_TIME_H
#define WRENCH_EXAMPLE_TWO_TASKS_AT_A_TIME_H

#include <wrench-dev.h>

@@ -42,4 +42,4 @@ namespace wrench {

};
}
#endif //WRENCH_TWO_TASKS_AT_A_TIME_H
#endif //WRENCH_EXAMPLE_TWO_TASKS_AT_A_TIME_H
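
An include guard only works when the name tested by `#ifndef` is exactly the name set by `#define`. The sketch below simulates including the same header twice within one file; the macro `WRENCH_EXAMPLE_GUARD_DEMO_H` and the function `guardedAnswer` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// First "inclusion": the guard macro is not defined yet, so the body is
// compiled and the macro becomes defined.
#ifndef WRENCH_EXAMPLE_GUARD_DEMO_H
#define WRENCH_EXAMPLE_GUARD_DEMO_H
constexpr int guardedAnswer() { return 42; }
#endif // WRENCH_EXAMPLE_GUARD_DEMO_H

// Second "inclusion": the macro is now defined, so the body is skipped.
// If the #define above used a different (misspelled) name, this body would be
// compiled again and the compiler would report a redefinition of guardedAnswer().
#ifndef WRENCH_EXAMPLE_GUARD_DEMO_H
#define WRENCH_EXAMPLE_GUARD_DEMO_H
constexpr int guardedAnswer() { return 42; }
#endif // WRENCH_EXAMPLE_GUARD_DEMO_H

// Compile-time check that exactly one definition is visible and usable.
static_assert(guardedAnswer() == 42, "guarded definition should be visible once");
```

Because the mismatch only shows up when a header is included more than once in a translation unit, a misspelled guard can go unnoticed for a long time.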
6 changes: 3 additions & 3 deletions examples/basic-examples/bare-metal-chain/OneTaskAtATimeWMS.h
Original file line number Diff line number Diff line change
@@ -8,8 +8,8 @@
*/


#ifndef WRENCH_ONE_TASK_AT_A_TIME_H
#define WRENCH_ONE_TASK_AT_A_TIME_H
#ifndef WRENCH_EXAMPLE_ONE_TASK_AT_A_TIME_H
#define WRENCH_EXAMPLE_ONE_TASK_AT_A_TIME_H

#include <wrench-dev.h>

@@ -42,4 +42,4 @@ namespace wrench {

};
}
#endif //WRENCH_ONE_TASK_AT_A_TIME_H
#endif //WRENCH_EXAMPLE_ONE_TASK_AT_A_TIME_H
Original file line number Diff line number Diff line change
@@ -8,8 +8,8 @@
*/


#ifndef WRENCH_COMPLEX_JOB_H
#define WRENCH_COMPLEX_JOB_H
#ifndef WRENCH_EXAMPLE_COMPLEX_JOB_H
#define WRENCH_EXAMPLE_COMPLEX_JOB_H

#include <wrench-dev.h>

@@ -42,4 +42,4 @@ namespace wrench {

};
}
#endif //WRENCH_COMPLEX_JOB_H
#endif //WRENCH_EXAMPLE_COMPLEX_JOB_H
12 changes: 12 additions & 0 deletions examples/basic-examples/bare-metal-data-movement/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
set(SOURCE_FILES
DataMovementWMS.h
DataMovementWMS.cpp
DataMovement.cpp
)

add_executable(wrench-bare-metal-data-movement-simulator ${SOURCE_FILES})

target_link_libraries(wrench-bare-metal-data-movement-simulator wrench ${SimGrid_LIBRARY} ${PUGIXML_LIBRARY} ${LEMON_LIBRARY})

install(TARGETS wrench-bare-metal-data-movement-simulator DESTINATION bin)

146 changes: 146 additions & 0 deletions examples/basic-examples/bare-metal-data-movement/DataMovement.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
/**
* Copyright (c) 2017-2018. The WRENCH Team.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*/

/**
** This simulator simulates the execution of a one-task workflow, where the task is:
**
** {InFile #0, InFile #1} -> Task -> {OutFile #0, OutFile #1}
**
** The compute platform comprises four hosts: WMSHost, StorageHost1, StorageHost2, and ComputeHost.
** A WMS (defined in class DataMovementWMS) runs on WMSHost. A bare-metal compute service
** runs on ComputeHost and has access to the 10 cores of that host. Two storage services
** run on StorageHost1 and StorageHost2, one on each host. Once the simulation is done,
** the completion time of each workflow task is printed.
**
** Example invocation of the simulator with only WMS logging:
** ./wrench-bare-metal-data-movement-simulator ./four_hosts.xml --wrench-no-logs --log=custom_wms.threshold=info
**
** Example invocation of the simulator with full logging:
** ./wrench-bare-metal-data-movement-simulator ./four_hosts.xml
**/


#include <iostream>
#include <wrench.h>

#include "DataMovementWMS.h" // WMS implementation

/**
* @brief The Simulator's main function
*
* @param argc: argument count
* @param argv: argument array
* @return 0 on success, non-zero otherwise
*/
int main(int argc, char **argv) {

/*
* Declare a WRENCH simulation object
*/
wrench::Simulation simulation;

/* Initialize the simulation, which may entail extracting WRENCH-specific and
* SimGrid-specific command-line arguments that can modify general simulation behavior.
* Two special command-line arguments are --help-wrench and --help-simgrid, which print
* details about available command-line arguments. */
simulation.init(&argc, argv);

/* Parsing of the command-line arguments for this WRENCH simulation */
if (argc != 2) {
std::cerr << "Usage: " << argv[0] << " <xml platform file> [--wrench-no-logs --log=custom_wms.threshold=info]" << std::endl;
exit(1);
}

/* Reading and parsing the platform description file, written in XML following the SimGrid-defined DTD,
* to instantiate the simulated platform */
std::cerr << "Instantiating simulated platform..." << std::endl;
simulation.instantiatePlatform(argv[1]);

/* Declare a workflow */
wrench::Workflow workflow;

/* Add the workflow task and its files */
auto task = workflow.addTask("task", 10000000000.0, 1, 10, 0.90, 10000000);
task->addInputFile(workflow.addFile("infile_1", 100000000));
task->addInputFile(workflow.addFile("infile_2", 10000000));
task->addOutputFile(workflow.addFile("outfile_1", 200000000));
task->addOutputFile(workflow.addFile("outfile_2", 5000000));

/* Instantiate a storage service, and add it to the simulation.
* A wrench::StorageService is an abstraction of a service on
* which files can be written and read. This particular storage service, which is an instance
* of wrench::SimpleStorageService, is started on StorageHost1 in the
* platform, which has an attached disk mounted at "/". The SimpleStorageService
* is a basic storage service implementation provided by WRENCH.
* Throughout the simulation execution, input/output files of workflow tasks will be located
* in this storage service, and accessed remotely by the compute service. Note that the
* storage service is configured to use a buffer size of 50 MB when transferring data over
* the network (i.e., to pipeline disk reads/writes and network receives/sends). */
std::cerr << "Instantiating a SimpleStorageService on StorageHost1..." << std::endl;
auto storage_service1 = simulation.add(new wrench::SimpleStorageService(
"StorageHost1", {"/"}, {{wrench::SimpleStorageServiceProperty::BUFFER_SIZE, "50000000"}}, {}));

/* Instantiate another one on StorageHost2 */
std::cerr << "Instantiating a SimpleStorageService on StorageHost2..." << std::endl;
auto storage_service2 = simulation.add(new wrench::SimpleStorageService(
"StorageHost2", {"/"}, {{wrench::SimpleStorageServiceProperty::BUFFER_SIZE, "50000000"}}, {}));

/* Instantiate a bare-metal compute service, and add it to the simulation.
* A wrench::BareMetalComputeService is an abstraction of a compute service that corresponds
* to a software infrastructure that can execute tasks on hardware resources.
* This particular service is started on ComputeHost and has no scratch storage space (mount point argument = "").
* This means that tasks running on this service will access data only from remote storage services. */
std::cerr << "Instantiating a BareMetalComputeService on ComputeHost..." << std::endl;
auto baremetal_service = simulation.add(new wrench::BareMetalComputeService(
"ComputeHost", {"ComputeHost"}, "", {}, {}));

/* Instantiate a WMS, to be started on WMSHost, which is responsible
* for executing the workflow. */

auto wms = simulation.add(
new wrench::DataMovementWMS({baremetal_service}, {storage_service1, storage_service2}, "WMSHost"));

/* Associate the workflow to the WMS */
wms->addWorkflow(&workflow);

/* Instantiate a file registry service to be started on WMSHost. This service is
* essentially a replica catalog that stores <file , storage service> pairs so that
* any service, in particular a WMS, can discover where workflow files are stored. */
std::cerr << "Instantiating a FileRegistryService on WMSHost ..." << std::endl;
auto file_registry_service = new wrench::FileRegistryService("WMSHost");
simulation.add(file_registry_service);

/* It is necessary to store, or "stage", the files that are only input files. The getInputFiles()
* method of the Workflow class returns the set of all workflow files that are not generated
* by workflow tasks, and thus are only input files. These files are then staged on the
* first storage service (the one running on StorageHost1). */
std::cerr << "Staging task input files..." << std::endl;
for (auto const &f : workflow.getInputFiles()) {
simulation.stageFile(f, storage_service1);
}

/* Launch the simulation. This call only returns when the simulation is complete. */
std::cerr << "Launching the Simulation..." << std::endl;
try {
simulation.launch();
} catch (std::runtime_error &e) {
std::cerr << "Exception: " << e.what() << std::endl;
return 1;
}
std::cerr << "Simulation done!" << std::endl;

/* Simulation results can be examined via simulation.getOutput(), which provides access to traces
* of events. In the code below, we retrieve the trace of all task completion events and
* print the ID and completion time of each completed task. */
auto trace = simulation.getOutput().getTrace<wrench::SimulationTimestampTaskCompletion>();
for (auto const &item : trace) {
std::cerr << "Task " << item->getContent()->getTask()->getID() << " completed at time " << item->getDate() << std::endl;
}

return 0;
}
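
The comments in `DataMovement.cpp` describe two ideas that can be sketched without WRENCH: input files are the files that no task produces, and the file registry maps each file to the storage services that hold it. The types and functions below (`ToyTask`, `ToyFileRegistry`, `inputOnlyFiles`, `stageInputs`) are hypothetical stand-ins, not the WRENCH API. In particular, recording staged files in the registry is an assumption of this sketch rather than something the diff shows.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy task: just the names of the files it reads and writes.
struct ToyTask {
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// Files that are read by some task but produced by none.
std::set<std::string> inputOnlyFiles(const std::vector<ToyTask> &tasks) {
    std::set<std::string> produced;
    for (auto const &t : tasks) {
        produced.insert(t.outputs.begin(), t.outputs.end());
    }
    std::set<std::string> result;
    for (auto const &t : tasks) {
        for (auto const &f : t.inputs) {
            if (produced.count(f) == 0) {
                result.insert(f);
            }
        }
    }
    return result;
}

// Toy replica catalog: file name -> names of services that hold a copy.
class ToyFileRegistry {
public:
    void addEntry(const std::string &file, const std::string &service) {
        entries_[file].insert(service);
    }

    // Returns an empty set when the file is unknown.
    std::set<std::string> lookup(const std::string &file) const {
        auto it = entries_.find(file);
        if (it == entries_.end()) {
            return {};
        }
        return it->second;
    }

private:
    std::map<std::string, std::set<std::string>> entries_;
};

// Stage every input-only file on one service and record where it went.
ToyFileRegistry stageInputs(const std::vector<ToyTask> &tasks, const std::string &service) {
    ToyFileRegistry registry;
    for (auto const &f : inputOnlyFiles(tasks)) {
        registry.addEntry(f, service);
    }
    return registry;
}
```

For the one-task workflow in this example, the input-only files would be `infile_1` and `infile_2`, and a lookup of either after staging would return the single service they were staged on.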