Loupe is a tool designed to help you analyze the system call usage of your application.
Loupe can do primarily two things: (1) collect data about the system call usage of your application(s), and (2) analyze the data collected for your application(s). It can tell you what system calls you need to run them, and visualize these numbers in a variety of plots.
Loupe is based on dynamic analysis, but it is also able to gather static analysis data. We put the emphasis on reproducible analysis: measurements are made in a Docker container.
Loupe stores analysis results in a custom database. A Loupe database is nothing more than a git repository with a particular layout. We offer an online, open database maintained by the community. Feel free to pull request your analysis results!
- Loupe supports stubbing/faking analysis (Loupe is not a passive observer)
- Loupe goes beyond system call granularity
- Loupe supports replication to obtain results stable across runs
- Loupe offers an infrastructure for reproducibility and sharing of results
Loupe is the result of a collaboration between The University of Manchester, Liège Université, University Politehnica of Bucharest, and Unikraft.io. The paper describing Loupe has been accepted to appear in ASPLOS'24.
Abstract: Supporting mainstream applications is fundamental for a new OS to have impact, and is generally achieved by developing a layer of compatibility such that applications developed for a mainstream OS like Linux can work, unmodified, on the new OS. Building such a layer, as we will show, results in large inefficiencies in terms of engineering effort due to the lack of efficient methods to precisely measure the OS features required by a set of applications.
We propose Loupe, a novel method based on dynamic analysis that determines the OS features that need to be implemented in a prototype OS to bring support for a target set of applications and workloads. Loupe guides and boosts OS developers as they build compatibility layers, prioritizing which features to implement in order to quickly support many applications as early as possible. We apply our methodology to 100+ applications and several OSes currently under development, demonstrating high engineering effort savings vs. existing approaches: for example, for the 62 applications supported by the OSv kernel, we show that using Loupe, would have required implementing only 37 system calls vs. 92 for the non-systematic process followed by OSv developers.
We further study our measurements and extract several novel key insights. Overall, we show that the burden of building compatibility layers is significantly less than what previous works suggest: in some cases, only as few as 20% of system calls reported by static analysis, and 50% of those reported by naive dynamic analysis need an implementation for an application to successfully run standard benchmarks.
Information on Loupe can also be found in Vlad-Radu Schiller's 3rd year project report.
If at all possible, please read through this entire document before installing or using Loupe. This document is best read on GitHub, with a Markdown viewer, or Markdown editor.
- 1. Hardware Dependencies
- 2. Dependencies & Install
- 3. Gathering Data
- 4. Retrieving and Processing Data
- 5. Advanced Features
- 6. Troubleshooting
- 7. Additional Documentation
- 8. Zenodo Artifact, Tags, and ASPLOS'24 Artifact Evaluation
- 9. Contributing
- 10. Disclaimer
- 11. Acknowledgements
Link to the Loupe ASPLOS'24 paper data set: [ASPLOS'24 Data Set]
Any x86 machine should do the trick - non-x86 might cause technical issues because of our Docker containers.
We recommend (but do not require) > 8 CPU cores to obtain stable performance numbers.
Loupe has been tested on Debian 12. While this is not a hard requirement (it will likely work on any Linux distribution), we have not tested other environments. Please do report an issue if Loupe does not run on your Linux-based system.
- Docker
- python3 (should work with any version of Python 3, known to work with at least 3.10.5)
- python-git (
pip3 install gitpython
, known to work with at least 3.1.27) - a recent-enough Linux kernel to support seccomp and ptrace (i.e., if your Linux kernel doesn't support them, you really seriously should update your setup 😇)
Once these dependencies have been installed, the setup is very simple: make all
loupe generate
takes care of analyzing the system call usage of your
application.
Loupe considers two types of workloads: benchmarks and test suites. Both measurements are independent. In this example, we'll illustrate both for Nginx.
Let's take a look at benchmarks first.
Step 0: first you'll need to generate the loupe-base
container image as
follows (in loupe's root directory):
$ make docker
Step 1: identify a benchmarking tool that we can use to benchmark Nginx.
Here we'll go for wrk.
Step 2: write a test script that uses wrk. Test scripts are used by Loupe to determine whether or not the application (here Nginx) works. Test scripts return 0 if the application works, 1 if it doesn't.
In the case of Nginx, our test script looks like the following:
#!/bin/bash
# nginx command: ./objs/nginx -p $(pwd) -g 'daemon off;'
bench=$(wrk http://localhost:80/index.html -d3s)
nl=$(echo ${bench} | grep -P "Transfer/sec:\s*\d+(?:\.\d+)MB" | wc -l)
if [ "$nl" -eq "1" ]; then
exit 0
fi
exit 1
This is a very simple script. We first start a 3s benchmark on port 80 with
wrk
. Then we parse the output for a chain of characters that typically
indicates success (here the throughput, which is absent if the benchmark fails).
We return 0 if it is present, and 1 otherwise.
Step 3: create a Docker container that build nginx and wrk, and performs
the system call analysis using Loupe. Use loupe-base
as basis for your Docker
container.
In our case, the Dockerfile looks like the following (slightly modified to exclude coverage, which is not necessary here):
FROM loupe-base:latest
# Install wrk
RUN apt update
RUN apt install -y wrk
# Nginx related instructions
RUN apt build-dep -y nginx
RUN wget https://nginx.org/download/nginx-1.20.1.tar.gz
RUN tar -xf nginx-1.20.1.tar.gz
RUN mv nginx-1.20.1 nginx
RUN cd nginx && ./configure \
--sbin-path=$(pwd)/nginx \
--conf-path=$(pwd)/conf/nginx.conf \
--pid-path=$(pwd)/nginx.pid
RUN cd nginx && make -j
RUN mkdir /root/nginx/logs
# Copy test script
COPY dockerfile_data/nginx-test.sh /root/nginx-test.sh
RUN chmod a+x /root/nginx-test.sh
# Run command (without coverage)
CMD /root/explore.py --output-csv -t /root/nginx-test.sh \
-b /root/nginx/objs/nginx \
-- -p /root/nginx -g "daemon off;"
This is a simple container. We install wrk
, and configure and build Nginx.
Finally, we start explore.py
. The arguments are quite important:
--output-csv
is necessary to enable parsing by Loupe-t /root/nginx-test.sh
indicates our test script-b /root/nginx-1.20.1/objs/nginx
indicates our binary--
denotes arguments that are passed on to nginx (remember that we want to callobjs/nginx -p $(pwd) -g 'daemon off;'
)
Step 4: start the analysis using the following command:
$ ./loupe generate -b -db ../loupedb -a "nginx" -w "wrk" -d ./Dockerfile.nginx
The arguments are important too:
-b
tells Loupe that you are running a benchmark, not a test suite, and-w
defines the name of the benchmark.-db
tells Loupe where the database is-a
tells Loupe the name of the software that we are benchmarking-d ./Dockerfile.nginx
passes our Dockerfile from Step 3
Gathering data for a test suite is not fundamentally different from a benchmark.
Step 1: identify the test suite. In many cases it will be in the same repository as the software itself, but the case of Nginx it is located in a separate repository.
Step 2: write a test script that parses the output of the test suite and determines whether it is a success or a failure. It is generally a very simple task: run it once manually, and search for a string that indicates success.
In the case of Nginx, our test script looks like the following:
#!/bin/bash
nl=$(cat $1 | grep -P "All tests successful." | wc -l)
if [ "$nl" -eq "1" ]; then
exit 0
fi
exit 1
Step 3: create a Docker container that build nginx and wrk, and performs
the system call analysis using Loupe. Use loupe-base
as basis for your Docker
container.
This Dockerfile will be very similar to the one you wrote for the benchmark.
In our case, the Dockerfile looks like the following:
FROM loupe-base:latest
RUN apt build-dep -y nginx
# needed to run the test suite
RUN apt install -y prove6
COPY dockerfile_data/nginx-test.sh /root/nginx-test.sh
RUN chmod a+x /root/nginx-test.sh
# the test suite cannot be run as root - don't ask me why
RUN useradd -ms /bin/bash user
USER user
RUN mkdir /tmp/nginx
WORKDIR /tmp/nginx
# Nginx related instructions
RUN wget https://nginx.org/download/nginx-1.20.1.tar.gz
RUN tar -xf nginx-1.20.1.tar.gz
RUN cd nginx-1.20.1 && ./configure \
--sbin-path=$(pwd)/nginx \
--conf-path=$(pwd)/conf/nginx.conf \
--pid-path=$(pwd)/nginx.pid
RUN cd nginx-1.20.1 && make -j && mkdir logs
RUN cd nginx-1.20.1 && sed -i "s/listen 80/listen 8034/g" conf/nginx.conf
# necessary for the test suite
RUN mv nginx-1.20.1 nginx
# test suite
RUN git clone https://github.com/nginx/nginx-tests.git
CMD cd /tmp/nginx/nginx-tests && /root/explore.py --output-csv \
--only-consider /tmp/nginx/nginx/objs/nginx \
--timeout 600 --test-sequential -t /root/nginx-test.sh \
-b /usr/bin/prove -- -m .
Step 4: start the analysis using the following command:
$ ./loupe generate -s -db ../loupedb -a "nginx" -w "nginx-tests" -d ./Dockerfile.nginx
As you can see, the only difference is the -s
instead of -b
to indicate
that this is a testsuite.
Note: for test suites, -w
can be empty.
Existing runs can be easily reproduced. In order to reproduce Example 1:
$ cd loupedb/nginx/benchmark-wrk/7883824b5cbef4f66dd1c9bdcf7d6185
$ loupe generate -b -db ../../../../loupedb -a "nginx" -w "wrk" -d ./Dockerfile.nginx
git diff
can then be used to visualize changes.
In practice, you likely want to write your Docker containers a little bit more carefully to ensure that the analysis remains stable and reproducible over time. We provide recommendations in GOOD_DOCKERFILES.md
.
Once the analysis completed, you can check the database (loupedb
in the
previous examples). You will see a number of uncommitted changes - they are the
results of the analysis. The database is formatted in a custom text-based
layout documented here.
As a user, you do not need to understand this format; loupe search
takes care
of analyzing the data in the database for you.
Loupe can simply output the database's raw data:
$ ./loupe search --show-usage -db ../loupedb -a "nginx" -w benchmark
[I] Checking database...
Required:
[0, 3, 9, 17, 18, 20, 41, 45, 49, 50, 53, 54, 59, 158, 213, 232, 233, 257, 262]
Can be stubbed:
[8, 10, 13, 16, 33, 39, 56, 63, 72, 83, 105, 106, 302]
Can be faked:
[288]
Can be both stubbed or faked:
[1, 11, 12, 14, 21, 40, 42, 92, 107, 110, 116, 157, 218, 273, 290, 318, 334]
- You can replace
nginx
with any other application name, list of names (comma-separated), or a*
for all. - You can replace
benchmark
withsuite
to obtain data for the test-suite, or*
for both.
Loupe can generate the paper's plot using information from the database.
Simply run
$ ./loupe search --cumulative-plot -db ../loupedb -a "*" -w suite
to output a cumulative plot of the database's content for test-suites and all applications.
Similarly, run
$ ./loupe search --heatmap-plot -db ../loupedb -a "*" -w suite
to output a heatmap plot for the same data set.
Similarly, run
$ ./loupe search --paper-histogram-plot -db ../loupedb
to output the histogram plot of the paper.
To get an optimized order of syscall implementation/faking/stubbing for a given OS (characterized by a set of already-supported system calls) towards a particular set of applications:
$ ./loupe search -db ../loupedb --guide-support <already supported syscalls> --applications <apps> --workloads <workload>
Here, <already supported syscalls>
is a newline-separated list of syscalls
already supported by the target OS (see the *.syscalls
files at the root
of loupedb
for examples), <apps>
is the list of application to target,
and <workload>
is bench
or suite
.
Run:
$ make paperplots
Generated plots will be located under paperplots
.
Here we describe advanced features supported by Loupe.
loupe
main wrapper, and may not be stable.
We provide a script that integrates Loupe with debhelper
.
Our script automatically downloads the debian sources of a package, builds it and then runs the test suite with loupe (if supported by the package).
We use dh_test_auto
to run the testsuite. dh_test_auto
returns 0 if the test suite executed successfully.
Example of usage with memcached:
sudo docker build --tag loupe-dbhelper --build-arg APP=memcached --build-arg BINARY=memcached-debug -f debhelper/Dockerfile.debhelper .
Note the two variables passed, APP
and BINARY
.
APP
is the identifier for the debian package that we are targeting, in this casememcached
.BINARY
is the resulting binary that we will be analysing, in this casememcached-debug
is being run by the memcached suite automatically.
The next step is to run loupe:
sudo docker container run --rm --privileged -e "BINARY=memcached-debug" -e "APP=memcached" -it loupe-dbhelper
Or if you would want to run loupe manually from the container:
sudo docker container run --rm --privileged -e "BINARY=memcached-debug" -e "APP=memcached" -it loupe-dbhelper /bin/bash
TODO: We can get from the automatic build all the binaries being buit, and such we could pass them to BINARY
without user input.
Although we did not deploy this at a large scale, it is known to work with, at least: LigHTTPd, webfsd, Memcached.
Loupe can analyze, for each system call stubbed/faked, the impact on performance and resource usage.
For that...
- the test script must be extended to support benchmarking (usually very simple!);
- and
explore.py
must be called with particular arguments.
Step 1: Add support to the test script. The test script may be passed an
argument. If the value of the argument is benchmark
, then the script is asked
to output a performance number to stdout. If it doesn't, the performance
evaluation will fail. Taking the previous example with Nginx and wrk
, we obtain:
#!/bin/bash
# nginx command: ./objs/nginx -p $(pwd) -g 'daemon off;'
WRK_PATH=wrk
PORT=80
test_works() {
bench=$(${WRK_PATH} http://localhost:${PORT}/index.html -d3s)
failure=$(echo ${bench} | grep -P "unable to connect to localhost" | wc -l)
if [ "$failure" -eq "1" ]; then
exit 200
fi
nl=$(echo ${bench} | grep -P "Transfer/sec:\s*\d+(?:\.\d+)MB" | wc -l)
if [ "$nl" -eq "1" ]; then
exit 0
fi
exit 1
}
benchmark() {
taskset -c 6 ${WRK_PATH} http://localhost:${PORT}/index.html -d10s | \
grep -P "Requests/sec:\s*\d+(?:\.\d+)" | grep -Po "\d+(?:\.\d+)"
exit $?
}
if [ "$2" == "benchmark" ]; then
benchmark
else
test_works
fi
No need to modify the script to support resource usage analysis, this is done automatically.
You can find other examples of scripts that support performance measurement here and here.
Step 2: We now want to call explore.py
as following:
$ /root/explore.py --perf-analysis -t /root/nginx-test.sh -b /root/nginx/objs/nginx -- -p /root/nginx -g "daemon off;"
Note that support is not integrated in the loupe
main program. In order to
run this, manually start the Docker contain built previously while running the
analysis, and manually run:
$ docker run -it docker.io/library/nginx-loupe bash
$ /root/explore.py --perf-analysis -t /root/nginx-test.sh -b /root/nginx/objs/nginx -- -p /root/nginx -g "daemon off;"
This will output the performance and resource usage impact detailed analysis per-system call.
TODO: Integrate performance and resource usage analysis in the main loupe
wrapper.
Results: We provide a set of example raw results obtained with Nginx, Redis, and iPerf3 here.
You may be wondering what portion of the code a particular workload is exercising under a full Loupe analysis. Here is how to obtain a precise report.
Step 1: Check the Dockerfile of the application. If it already contains a
version of the software built with coverage, then proceed to step 2. Otherwise,
build the software (or a copy of it if performance measurements are planned)
with CFLAGS -fprofile-arcs -ftest-coverage
and LDFLAGS -lgcov
. An example
is visible here.
Step 2: Build the application container, e.g., for nginx:
$ ./loupe generate -b -db ../loupedb -a "nginx" -w "wrk" -d ./Dockerfile.nginx --only-build-docker
We don't need to perform the actual system call measurement, we only need the
container to be built, so we pass --only-build-docker
.
Step 3: Enter the container and generate coverage. It's as simple as running the instrumented software. For example, with Nginx:
$ docker run -it nginx-loupe:latest /bin/bash
root@b0ef11e440d0:~# # we're in the container, generate coverage data
root@b0ef11e440d0:~# /root/explore.py --output-csv -t /root/nginx-test.sh \
-b /root/nginx-coverage/objs/nginx \
-- -p /root/nginx -g "daemon off;"
root@b0ef11e440d0:~# # done. check the coverage data:
root@b0ef11e440d0:~# ls nginx-coverage/objs/src/core/
[ shows a number of .gcda and .gcno files ]
Step 4: Generate an lcov report. Example with Nginx, still in the same container:
root@b0ef11e440d0:~/nginx-coverage# lcov --capture --directory . --output-file cov.info
[shows files considered, if you get WARNINGs, investigate, they might be
invalid files considered in the coverage, like autotest.gcda at the root.
You can safely remove them.]
root@b0ef11e440d0:~/nginx-coverage# genhtml cov.info --output-directory out_html \
--demangle-cpp --legend \
--title "Nginx coverage wrk"
[...]
Overall coverage rate:
lines......: 13.0% (5291 of 40728 lines)
functions..: 20.1% (282 of 1405 functions)
root@b0ef11e440d0:~/nginx-coverage# ls out_html/
amber.png event http index.html snow.png
core gcov.css index-sort-f.html os updown.png
emerald.png glass.png index-sort-l.html ruby.png usr
The resulting html report is there.
This is a list of known issues, along with their solution. If your issue is not
in this list, feel free to open a bug report and contribute a
fix. This README is not exhaustive,
before opening a bug report, please check the help feature of the relevant binary or
script with -h
.
Issue 1: You want to rebuild the base container, but running make docker
doesn't do anything.
Solution: You can use make rebuild-docker
instead of make docker
.
Issue 2: Loupe fails with the following error message:
[E] Database /home/hle/Development/loupedb is dirty; commit your changes before running this tool.
Solution: You should either commit your changes to the database, or ignore the changes with --allow-dirty-db
Issue 3: The container hangs while building, with the following error:
Please select the geographic area in which you live. Subsequent configuration
questions will narrow this down by presenting a list of cities, representing
the time zones in which they are located.
1. Africa 4. Australia 7. Atlantic 10. Pacific 13. Etc
2. America 5. Arctic 8. Europe 11. SystemV
3. Antarctica 6. Asia 9. Indian 12. US
Geographic area:
Solution: We likely forgot to add ARG DEBIAN_FRONTEND=noninteractive
in the Dockerfile. Try to add it, it should address the issue.
Issue 4: Performance measurement shows with the following error:
[E] Bug here! Failed to determine VmPeak.
...and the value for the memory usage of one or more system calls is -1.
Solution: It is likely that the program is crashing during performance analysis, and that Loupe is consequently unable to measure the peak process memory usage. You should assume that those system calls cannot be faked or stubbed.
Issue 5: Performance measurement shows with the following error:
[E] Bug here! VmPeak not in kB:${something}
...and the value for the memory usage of one or more system calls is -1.
Solution: This is a bug in Loupe. Please submit a bug report.
Issue 6: I cannot reproduce certain measurements, despite of running the reproduce command as instructed in the README.md.
Solution: Although infrequently happening, this is a known issue. The reason varies; it can be due to the Loupe code having changed since the run was performed, to the Docker container not being perfectly reproducible (e.g., versions of the application, shared libraries not being fixed properly), to the kernel having changed, etc.
Loupe has been developed over a span of 3 years, during which we generated various results, not all taking advantage of the full set of features that Loupe offers in its current state.
We are working on implementing best practices over the data set.
Issue 7: Loupe fails to build the Docker container with 404 Not Found
errors on an APT command.
Solution: The APT command is likely missing a apt-get update -q -y &&
prefix - however you do not necessarily need to modify the Docker container. You can simply rebuild the base container with make rebuild-docker
and re-run Loupe.
Issue 8: Loupe output messages like "write /dev/stdout: broken pipe" during the analysis.
Solution: This is a known issue - Loupe is not redirecting the output of all
children to /dev/null
during the analysis. You can safely ignore this
message.
In addition to this README, interested readers may want to take a look at...
STRUCTURE.md
, which describes the structure of this repositoryDATABASE_FORMAT.md
, which describes the structure of a Loupe databaseGOOD_DOCKERFILES.md
, which provides advice on writing Dockerfiles for reproducible Loupe analysis
In addition to this repository, we have archived this artifact on Zenodo. In
order to make the Zenodo artifact as self-contained as possible, we included a
copy of the Loupe database along with
this repository. These are provided for the main purpose of archival. You can
generate a new snapshot with make zenodo
.
We tagged both repositories with asplos24-ae-v1
before submission. Future
iterations on the artifact will be tagged with -v2
, etc.
We welcome contributions from anyone. This is free software.
To contribute to this repository, please fork and submit a Pull-Request. If you
introduce a new file, make sure to add an SPDX license header. If you do
significant-enough changes, consider adding yourself to COPYING.md
.
Here are a few ideas of contributions to get started:
- Add support to convert Loupe databases to SQLite.
- Fix open bug reports.
This artifact is the first release of a research proof-of-concept for Loupe. Like any research prototype, it contains hacks, bugs, and TODOs. Please use it with a critical eye. We hope that it will be useful!
This artifact would not exist without the infrastructure and hard work of the Unikraft community. We encourage interested researchers to visit the project's web page and GitHub.