Leveraging Deep Learning for Intelligent Loop Unrolling in LLVM
- Installed Tools: Have python3, pip, LLVM, clang, and opt installed and available.
- Dependencies: Install the necessary python dependencies:
pip install -r requirements.txt
You may additionally need to run:
pip install scipy --upgrade
pip install xgboost --upgrade
- Input File: Have a .c source file to be processed.
Navigate to the build directory of DeepUnrollFunction, or create one if it does not exist. Then configure and build the project.
cd DeepUnroll/DeepUnrollFunction/build
cmake ..
make
Note: If there are errors when running cmake .. or make, run the following in the build directory and then redo the commands above.
make clean
rm -rf *
The /DeepUnroll/Models directory contains 3 models to choose from. After selecting a model:
- Move the files for the chosen model into the /DeepUnroll/DeepUnrollFunction directory.
- In /DeepUnroll/DeepUnrollFunction/DeepUnrollPass/Pass.cpp, go to the TODO, keep the command that matches your model, and comment out the others:
std::string command = "python3 scriptBoost.py " + filename;
std::string command = "python3 script.py " + filename;
std::string command = "python3 scriptBERT.py " + InputFileName + " " + filename;
Note: The script we call runs the DeepUnroll LLVM pass, which in turn calls our model. The model relies on pre-trained parameters such as 'xgb_model.json' and 'scaler.pkl'.
Choose whether to run the evaluation for a single file (evaluation.sh) or for a directory (directoryEval.sh). Both scripts are located in DeepUnroll/DeepUnrollFunction.
Run the evaluation.sh script like so:
./evaluation.sh <path to .c file>
OR
Run the directoryEval.sh script like so:
./directoryEval.sh <path to directory of .c files>
The terminal output gives execution times for these cases: O0,O1,O2,O3,optimized,optimized.O1,optimized.O2,optimized.O3
The main output is a results.csv file with these headers: O0,O1,O2,O3,optimized,optimized.O1,optimized.O2,optimized.O3
- O0_executable: Executable compiled with no optimizations (-O0).
- O1_executable: Executable compiled with basic optimizations (-O1).
- O2_executable: Executable compiled with moderate optimizations (-O2).
- O3_executable: Executable compiled with aggressive optimizations (-O3).
- optimized_executable: Executable generated after applying the custom LLVM pass without further optimizations.
- optimized.O1_executable: Executable generated by applying the custom LLVM pass, followed by -O1 optimizations.
- optimized.O2_executable: Executable generated by applying the custom LLVM pass, followed by -O2 optimizations.
- optimized.O3_executable: Executable generated by applying the custom LLVM pass, followed by -O3 optimizations.
Run python3 analysis.py <results file> to generate summary statistics on the results generated by directoryEval.sh. The script removes outliers and calculates the average for each column.
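To make the "remove outliers, then average each column" step concrete, here is a minimal sketch. It is not the code in analysis.py: the function name and the 1.5 x IQR outlier rule are assumptions chosen for illustration, since the README does not say which outlier criterion the script uses.

```python
import csv
import io
import statistics


def column_means_without_outliers(csv_text, k=1.5):
    """Return {column: mean} after dropping values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule stands in for whatever outlier filter analysis.py applies.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    means = {}
    for column in rows[0].keys():
        # Skip empty cells so partially filled rows do not break the parse.
        values = [float(row[column]) for row in rows if row[column] not in ("", None)]
        if not values:
            continue
        if len(values) >= 4:
            q1, _median, q3 = statistics.quantiles(values, n=4)
            iqr = q3 - q1
            low, high = q1 - k * iqr, q3 + k * iqr
            values = [v for v in values if low <= v <= high]
        means[column] = statistics.mean(values)
    return means
```

To try it on a real file, pass the file contents, for example column_means_without_outliers(open("results.csv").read()).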
- LLVM Pass Shared Object (.so): Ensure that your custom LLVM pass is compiled into a shared object file.
- Input Directory: Have a directory (e.g., ./codeFiles) containing .c source files to be processed.
- Installed Tools: Have LLVM, clang, and opt installed and available.
Navigate to the build directory of LoopUnrollFunction, or create one if it does not exist. Then configure and build the project.
cd DeepUnroll/LoopUnrollFunction/build
cmake ..
make
Note: If there are errors when running cmake .. or make, run the following in the build directory and then redo the commands above.
make clean
rm -rf *
Create a directory (e.g., codeFiles) in the project root and add all your .c source files to it. NOTE: There are premade folders with names that can be used so this step can be SKIPPED. The following is just how to move files to a more convenient location if desired.
mkdir codeFiles
cp /path/to/source/files/*.c ./codeFiles/
Useful note: To get the number of files in a folder, run:
find path/to/dir/ -type f | wc -l
Run the process_and_benchmark.sh script like so:
./process_and_benchmark.sh <directory_path> <offset> <iterations>
- <directory_path>: Path to the directory containing .c files (e.g., ./codeFiles).
- <offset>: Start processing from the nth file in sorted order.
- <iterations>: Number of files to process after the offset.
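The way <offset> and <iterations> pick a window of files can be sketched as follows. This is an illustration rather than the script's own logic: select_files is a hypothetical helper, and it assumes a 0-based offset (the nohup example in this README passes 0) and plain lexicographic sorting.

```python
def select_files(names, offset, iterations):
    """Pick the .c files a run would process from a list of file names.

    Assumptions: offset is 0-based and files are sorted lexicographically.
    """
    c_files = sorted(name for name in names if name.endswith(".c"))
    # Slicing past the end is safe: Python returns whatever files remain.
    return c_files[offset:offset + iterations]
```

With os.listdir("./codeFiles") as the names argument, select_files(names, 0, 200) would return up to the first 200 .c files, which makes it easy to split a large directory across several runs.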
NOTE: At the top of the script there is a TIMEOUT_DURATION hyperparameter you can set. If a file with a specific unroll factor cannot be compiled into bytecode or into an executable within TIMEOUT_DURATION seconds, that loop unroll factor is skipped.
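The skip-on-timeout behavior described above can be sketched like this. The shell script's actual mechanism is not shown in this README, so run_with_timeout is a hypothetical helper that only demonstrates the idea using Python's subprocess timeout.

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds):
    """Return True if cmd exits with status 0 within the time limit, else False.

    A False result is the signal to skip the current unroll factor.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running.
        return False
    except OSError:
        # The command could not be started at all (e.g. the tool is not installed).
        return False
    return result.returncode == 0
```

A caller would loop over candidate unroll factors, build the compile command for each, and keep only the factors for which run_with_timeout returns True.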
If you want to run this as a process that will continue even if you get logged out of your server, use nohup like so:
nohup ./process_and_benchmark.sh ../Andre 0 200 > output.log 2>&1 &
Output Files:
- IR files (.ll) and executables (_exec) are generated temporarily and cleaned up after processing.
Result File:
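- The script appends the fastest loop unroll factor for each loop to an output file whose name is set at the top of the script. By default, it is "${DIRECTORY##*/}_output.txt". The ${DIRECTORY##*/} expansion strips everything up to and including the last "/", so if DIRECTORY is ../Andre the file is Andre_output.txt.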
Building and compiling in this context involve creating a shared object (.so) file for your pass that can be dynamically loaded into the opt tool. The opt tool is the command-line utility provided by LLVM for working with its Intermediate Representation (IR).
cd build
cmake ..
make
Compile a source file (e.g., test.c) into LLVM IR using clang:
clang -emit-llvm -S -Xclang -disable-O0-optnone test.c -o test.ll
Note: The optnone attribute, added by Clang when compiling with -O0, instructs the LLVM optimizer and analysis passes to skip processing functions entirely, preserving their exact structure for debugging. While useful for debugging, this attribute prevents custom LLVM passes from analyzing or transforming these functions. Using -Xclang -disable-O0-optnone ensures unoptimized IR is generated without the optnone attribute, allowing all functions to be fully processed by LLVM passes while maintaining their unoptimized state for analysis.
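One quick way to confirm that the generated IR is free of optnone is to scan the attribute groups in the .ll file. The sketch below is a hypothetical helper, not part of this project. It assumes the textual IR layout in which attribute groups appear on single lines of the form `attributes #N = { ... }`.

```python
import re

# Matches single-line attribute groups such as:
#   attributes #0 = { noinline nounwind optnone uwtable }
ATTR_GROUP = re.compile(r"^attributes #(\d+) = \{(.*)\}\s*$", re.MULTILINE)


def optnone_groups(ir_text):
    """Return the IDs of attribute groups that contain the optnone attribute."""
    return [
        int(group_id)
        for group_id, body in ATTR_GROUP.findall(ir_text)
        if "optnone" in body.split()
    ]
```

An empty list for the contents of test.ll suggests that no function carries optnone through an attribute group, so custom passes should be able to process every function.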
Use the opt tool to apply the pass (e.g., GatherDataFunctionPass) to the LLVM IR:
opt -disable-output -load-pass-plugin=./build/GatherDataFunctionPass/GatherDataFunctionPass.so -passes="loop-info" test.ll
or
opt -disable-output -load-pass-plugin=./build/LoopUnrollFunctionPass/LoopUnrollFunctionPass.so -passes="loop-unroll-emit" test.ll
or
opt -disable-output -load-pass-plugin=./build/DeepUnrollPass/DeepUnrollPass.so -passes="deep-unroll" -input-filename=test.c test.ll
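The three invocations above share one shape, which can be captured in a small wrapper when scripting runs from Python. This is a sketch only: build_opt_command is a hypothetical helper, and it uses no flags beyond those already shown in the commands above.

```python
def build_opt_command(plugin_path, pass_name, ir_file, extra_args=()):
    """Build the argument list for running one plugin pass with opt.

    extra_args carries pass-specific options such as -input-filename=test.c.
    """
    return [
        "opt",
        "-disable-output",
        f"-load-pass-plugin={plugin_path}",
        f"-passes={pass_name}",
        *extra_args,
        ir_file,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True). Using a list rather than a shell string avoids quoting problems with paths that contain spaces.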
If the pass produces JSON output, you can prettify it:
python3 -m json.tool output.json > pretty_output.json
Depth,BasicBlocks,TotalInstructions,MemoryOperations,BranchInstructions,PHINodes,FlowDeps,AntiDeps,OutputDeps,InputDeps,IsSimplified,NormalizedBlockFrequencies,BranchProbabilities, IDEAL LOOP UNROLL FACTOR
Server/Execution Requirements
- Install LLVM 18.1.8
Python Requirements
- pip3 install llvmlite