From 428454a0a10ba7b366ad0b92c9ddf1f06b6a520f Mon Sep 17 00:00:00 2001
From: Justen Di Ruscio
Date: Mon, 25 Jul 2022 20:17:40 -0300
Subject: [PATCH] royal viking battles against my auto-formatter

---
 doc/tutorial/mutli_arch.hpp | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/doc/tutorial/mutli_arch.hpp b/doc/tutorial/mutli_arch.hpp
index 114262fdd9a..c70133f272f 100644
--- a/doc/tutorial/mutli_arch.hpp
+++ b/doc/tutorial/mutli_arch.hpp
@@ -66,10 +66,10 @@ What is going on inside `compute` ?
 afterward but understand that they are arbitrary. To do so, we use *Immediately Invoked Lambda
 Expressions* to perform this initialisation.
 + Next, we load the `compute_kernel` symbol from the handle we just loaded. As opening and
-fetching pointer to symbol can be costly, we will make it so those tasks are done once by storing
-both results in static variables.
+ fetching pointer to symbol can be costly, we will make it so those tasks are done once by
+ storing both results in static variables.
 + Once everything is loaded and setup, we call the function pointer with the appropriate
-parameters.
+ parameters.

 Obviously, in realistic settings you would actually care about runtime issues, check that every
 pointer are non-null, and use functions like `dlerror` to find out what would have caused an issue.
@@ -135,18 +135,18 @@ Let's get into the details:
 **EVE** installation path, you may have to specify `CMAKE_PREFIX_PATH` for CMake to find it.
 + We define an INTERFACE library that use the main **EVE** interface and add some customisation.
 + We call `eve_build_variants` to build a set of targets. The base name will be `compute` and
-there are three different target suffixes: `basic`, `advanced`, `perfect`. For each of those, the
-following options will be used: `-msse2`, `-msse4.1`, `-march=haswell`. Each target is compiled
-with its corresponding options. In the end, we expect three libraries to be compiled:
-`libcompute_basic.so`, `libcompute_advanced.so`, and `libcompute_perfect.so`. Notice how the
-suffixes are arbitrary. They just need to correspond to any naming scheme you see fit and those
-names will have to be used in the dynamic loading function.
+ there are three different target suffixes: `basic`, `advanced`, `perfect`. For each of those,
+ the following options will be used: `-msse2`, `-msse4.1`, `-march=haswell`. Each target is
+ compiled with its corresponding options. In the end, we expect three libraries to be compiled:
+ `libcompute_basic.so`, `libcompute_advanced.so`, and `libcompute_perfect.so`. Notice how the
+ suffixes are arbitrary. They just need to correspond to any naming scheme you see fit and those
+ names will have to be used in the dynamic loading function.
 + To perform this compilation, we use the `setup` interface we customized earlier.
 + We add our executable target that just compiles a main file and the compute function. To be
 sure we don't forget to compile the libraries when we compile the main executable, we use the
-exported `eve_compute_variants` macro that contains the list of targets created by
-`eve_build_variants`. Because we use `dlopen`, we need to link with the appropriate library which is
-conveniently provided by `CMAKE_DL_LIBS`.
+ exported `eve_compute_variants` macro that contains the list of targets created by
+ `eve_build_variants`. Because we use `dlopen`, we need to link with the appropriate library
+ which is conveniently provided by `CMAKE_DL_LIBS`.

 That's all. Once generated, this CMake file will let you compile the `multi-arch` target that will
 trigger the compilation of the three libraries. Once compiled, the execution of the multi-arch
@@ -158,9 +158,7 @@ executable will produce some output:
 >> compute with: X86 AVX2 (with FMA3 support)
 >> eve::wide is: 8 elements large.
 After:
-2 2.82843 3.4641
-4 4.47214 4.89898 5.2915 5.65685 6.32456 8.94427 10.9545 12.6491 14.1421 15.4919 16.7332 17.8885 18.9737
-20 63.2456 200
+2 2.82843 3.4641 4 4.47214 4.89898 5.2915 5.65685 6.32456 8.94427 10.9545 12.6491 14.1421 15.4919 16.7332 17.8885 18.9737 20 63.2456 200
 ```

 The complete project is available as in the `examples/multi-arch` folder. As an exercise,