lrun

Run programs on Linux with limited resources (e.g. time, memory, network, devices, syscalls).

Dependencies

Runtime dependencies

  • linux: >= 2.6.26 minimum, >= 3.12 recommended. You can check the kernel config using utils/check_linux_config.rb.
  • libseccomp: 2.x (optional), to enable the syscall filtering feature.

Build dependencies

  • rake: The main lrun binary is built using a Rakefile.
  • g++: The code is in C++. g++ 4.6 and above is recommended. g++ 4.4 or clang++ should work as well.
  • install: To install binaries.
  • pkg-config: Get information about libseccomp (optional, but recommended).
  • git: Extract version information (optional, but recommended).

Installation dependencies

  • groupadd: Create the lrun group.
  • sudo: Install via a non-root user (optional).

Installation

Build from source

make install  # or: cd src && rake install

Configuration

lrun does not have any config files. However, non-root users must be added to the lrun group before being able to run lrun:

gpasswd -a username lrun

Note: On Linux <= 3.5, if sudo is installed, a user in the lrun group can use lrun for privilege escalation.

Build options

Several environment variables affect the build process:

  • PREFIX: Install destination. Default is /usr/local.
  • CXX: The C++ compiler. For example, clang++ or g++.
  • CXXFLAGS: Flags used for C++ compiler. Default is -O2 -Wall.
  • INSTALL: The install binary.
  • LRUN_GROUP: The group that has access to run lrun directly. Default is lrun.
  • NDEBUG: If set, remove some debug code and produce a smaller executable.
  • NOSECCOMP: If set, always build without libseccomp support.

Arch Linux

Arch Linux users can install lrun from AUR:

yaourt -S lrun

Usage

lrun --help

Output Format

lrun writes its final report to fd 3, which makes it easy to pass stdin, stdout and stderr directly to the child process. If the child process gets executed, the fd 3 output looks like this (without the # comments):

MEMORY   int         # in bytes
CPUTIME  float       # in seconds
REALTIME float       # in seconds
SIGNALED int         # one of: 0, 1. 1 means the process was signaled (exited abnormally)
EXITCODE int         # exit code
TERMSIG  int         # signal number, 0 if not signaled
EXCEED   exceed_enum # one of: none, CPU_TIME, REAL_TIME, MEMORY, OUTPUT

If the child process does not get executed (e.g. the path does not exist), nothing will be written to fd 3.
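
A script that consumes this report might parse it as shown here. This is a minimal Python sketch, not part of lrun: the function name parse_lrun_report and the mapping to Python types are assumptions for illustration, based only on the field list above.

```python
def parse_lrun_report(text):
    """Parse the text lrun wrote to fd 3 into a dict.

    Returns an empty dict for empty input, which is what a caller sees
    when the child process was never executed.
    """
    # Field types follow the format description in this README.
    converters = {
        "MEMORY": int,      # bytes
        "CPUTIME": float,   # seconds
        "REALTIME": float,  # seconds
        "SIGNALED": int,    # 0 or 1
        "EXITCODE": int,
        "TERMSIG": int,     # 0 if not signaled
        "EXCEED": str,      # none, CPU_TIME, REAL_TIME, MEMORY, OUTPUT
    }
    report = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or parts[0] not in converters:
            continue  # skip blank or unrecognized lines
        key, value = parts
        report[key] = converters[key](value)
    return report


# Sample input copied from the "Limit time" example in this README.
SAMPLE = """\
MEMORY   10461184
CPUTIME  1.500
REALTIME 1.507
SIGNALED 0
EXITCODE 0
TERMSIG  0
EXCEED   CPU_TIME
"""

if __name__ == "__main__":
    print(parse_lrun_report(SAMPLE))
```

An empty result is worth checking for explicitly, since it is the only indication that the child was not started.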

Examples

Limit time

% lrun --max-cpu-time 1.5 bash -c ':(){ :;};:' 3>&1
MEMORY   10461184
CPUTIME  1.500
REALTIME 1.507
SIGNALED 0
EXITCODE 0
TERMSIG  0
EXCEED   CPU_TIME
% lrun --max-real-time 1.0 sleep 2 3>&1
MEMORY   393216
CPUTIME  0.001
REALTIME 1.000
SIGNALED 0
EXITCODE 0
TERMSIG  0
EXCEED   REAL_TIME
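
When lrun is launched from a program rather than a shell, the 3>&1 redirection has to be set up by hand. The following is a minimal Python sketch under stated assumptions: run_capturing_fd3 is a hypothetical helper, not part of lrun, and it relies on preexec_fn, which the Python documentation warns is not safe in multithreaded programs. It connects fd 3 of the child to a pipe and leaves stdin, stdout and stderr untouched.

```python
import os
import subprocess


def run_capturing_fd3(argv):
    """Run argv with its fd 3 connected to a pipe.

    Returns (exit_status, text_written_to_fd_3).
    """
    read_end, write_end = os.pipe()

    def attach_fd3():
        # Runs in the child between fork and exec.
        if write_end == 3:
            os.set_inheritable(3, True)
        else:
            os.dup2(write_end, 3)  # the duplicate is inheritable by default

    try:
        # close_fds=False keeps fd 3 open across exec. The original pipe
        # ends are close-on-exec, so they do not leak into the child.
        proc = subprocess.Popen(argv, close_fds=False, preexec_fn=attach_fd3)
    except Exception:
        os.close(read_end)
        raise
    finally:
        os.close(write_end)  # the parent must close its copy to see EOF

    with os.fdopen(read_end) as report:
        # Returns at EOF, i.e. once every process holding fd 3 has closed it.
        text = report.read()
    return proc.wait(), text


if __name__ == "__main__":
    # Stand-in for lrun so the sketch runs anywhere with a POSIX shell.
    status, text = run_capturing_fd3(["sh", "-c", "echo 'EXCEED none' >&3"])
    print(status, repr(text))
```

With lrun installed, the call would look something like run_capturing_fd3(["lrun", "--max-real-time", "1.0", "sleep", "2"]). The pipe is drained before waiting so that a long report cannot block the child.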

Limit memory

% lrun --max-memory 1000000 gedit 3>&1
MEMORY   1000000
CPUTIME  0.003
REALTIME 0.020
SIGNALED 0
EXITCODE 0
TERMSIG  0
EXCEED   MEMORY
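
Note that in the examples above EXITCODE and SIGNALED are both 0 even though a limit was hit, so a caller probably wants to look at EXCEED first. A minimal Python sketch of that ordering follows; describe_outcome and its wording are hypothetical, and the input is assumed to be a dict mapping each field name to its raw string value.

```python
def describe_outcome(report):
    """Summarize an fd-3 report given as {field name: string value}."""
    if not report:
        return "not executed"  # nothing was written to fd 3
    if report["EXCEED"] != "none":
        return "limit exceeded: " + report["EXCEED"]
    if report["SIGNALED"] == "1":
        return "terminated by signal " + report["TERMSIG"]
    if report["EXITCODE"] != "0":
        return "exited with code " + report["EXITCODE"]
    return "ok"


if __name__ == "__main__":
    # Values copied from the "Limit memory" example in this README.
    memory_example = {
        "MEMORY": "1000000", "CPUTIME": "0.003", "REALTIME": "0.020",
        "SIGNALED": "0", "EXITCODE": "0", "TERMSIG": "0", "EXCEED": "MEMORY",
    }
    print(describe_outcome(memory_example))  # limit exceeded: MEMORY
```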

Restrict network

% lrun --network true /sbin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:26:82:af:cf:75 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.3/24 brd 192.168.1.255 scope global wlan0
    inet6 fe80::226:82ff:feaf:cf75/64 scope link
       valid_lft forever preferred_lft forever

% lrun --network false /sbin/ip addr
205: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

Note: On some kernels, creating an empty network namespace takes a global lock and can hurt parallelism. To work around that, run lrun-netns-empty --create (or sudo ip netns add lrun-empty) once after reboot, then use lrun --netns lrun-empty instead of lrun --network false.

Isolate processes

% lrun --isolate-process false bash -c 'echo $$'
10140

% lrun --isolate-process true bash -c 'echo $$'
2  # or 1, see Note below

On Linux >= 3.8, the user process won’t run as pid 1. Instead, a dummy init process is spawned and the user process will run as pid 2. This avoids some potential issues because pid 1 has some special behaviors.

Change uid

% sudo lrun --uid 2000 --gid 200 /usr/bin/sudo ls
sudo: unknown uid 2000: who are you?

Non-root users cannot use --uid or --gid, and root must provide both options.

Mount tmpfs

% lrun ls /usr
NX  bin  i486-mingw32  include	lib  lib32  local  man	sbin  share  src  x86_64-unknown-linux-gnu

% lrun --tmpfs /usr 40960 df /usr
Filesystem     1K-blocks  Used Available Use% Mounted on
none                  40     0        40   0% /usr

% lrun --tmpfs /tmp 0 touch /tmp/abc 3>&1
touch: cannot touch `/tmp/abc': Read-only file system
MEMORY   262144
CPUTIME  0.001
REALTIME 0.090
SIGNALED 0
EXITCODE 1
TERMSIG  0
EXCEED   none

There is also --bindfs. Non-root users can only mount A to B if they can read A.

Syscall filter

This requires libseccomp >= 2.0, at both compile and run time.

% lrun readlink /lib
usr/lib
% lrun --syscalls '!readlink' readlink /lib 3>&1
MEMORY   262144
CPUTIME  0.000
REALTIME 0.070
SIGNALED 0
EXITCODE 1
TERMSIG  0
EXCEED   none

File-open filter

% lrun --fopen-filter f:/etc/fstab d cat /etc/fstab
cat: /etc/fstab: Operation not permitted
% lrun --fopen-filter 'm:/proc:^/proc/.*stat.*$' d wc -l /proc/self/status
wc: /proc/self/status: Operation not permitted
% lrun --fopen-filter 'm:/proc:^/proc/.*stat.*$' d wc -l /proc/self/io
7 /proc/self/io
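
To see why /proc/self/status is refused while /proc/self/io is not, the regular expression from the m:/proc:... filter can be tried against candidate paths. This Python sketch uses the re module as a stand-in; lrun's own regex dialect may differ in details, and reading the d argument as "deny" is an assumption inferred from the examples above.

```python
import re

# The pattern from the m:/proc:... filter in the examples above.
PATTERN = re.compile(r"^/proc/.*stat.*$")


def matches_filter(path):
    """Return True if path matches the pattern, i.e. the filter would apply."""
    return PATTERN.search(path) is not None


if __name__ == "__main__":
    for path in ("/proc/self/status", "/proc/self/io", "/proc/stat"):
        print(path, matches_filter(path))
```

Only paths containing "stat" somewhere after /proc/ match, which is consistent with the two transcripts above.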

Realtime status

Use --status to show real-time CPU and memory usage information:

% lrun --status firefox

Utilities

There are some related utilities in utils directory. You may find some of them helpful.

mirrorfs

A utility that helps set up chroot environments by mirroring part of the current filesystem. The binary is available as lrun-mirrorfs in the deb package.

Testing

cgroup v2 support is tested on Ubuntu 22.04. It’s a good idea to run the tests in a container. You can take test.Dockerfile as a starting point.
Some useful issues:

https://github.com/moby/moby/issues/43093

Troubleshooting

Error: “FATAL: can not mount cgroup memory on ‘/sys/fs/cgroup/memory’ (No such file or directory)”

You are probably using Debian. The memory controller is compiled in but disabled by default. Try adding cgroup_enable=memory as a kernel parameter.
When using grub2, this can be done by editing GRUB_CMDLINE_LINUX in /etc/default/grub and running update-grub2.
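
Before rebooting, it may help to confirm that the memory controller really is disabled. On kernels that expose /proc/cgroups, each line lists a controller followed by its hierarchy, cgroup count and an enabled flag. The following Python sketch reads that flag; memory_controller_enabled is a hypothetical helper, not something shipped with lrun.

```python
def memory_controller_enabled(cgroups_text):
    """Return the memory controller's enabled flag from /proc/cgroups text.

    Returns True or False, or None if the controller is not listed.
    """
    for line in cgroups_text.splitlines():
        if line.startswith("#"):
            continue  # header: #subsys_name hierarchy num_cgroups enabled
        fields = line.split()
        if len(fields) == 4 and fields[0] == "memory":
            return fields[3] == "1"
    return None


# Illustrative sample of a system with the memory controller disabled.
SAMPLE = (
    "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
    "cpuset\t0\t1\t1\n"
    "memory\t0\t1\t0\n"
)

if __name__ == "__main__":
    print(memory_controller_enabled(SAMPLE))  # False
```

On a live system the input would come from open("/proc/cgroups").read().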

File-open filter cannot be used

You are probably using Debian. File-open filter requires the kernel to be compiled with CONFIG_FANOTIFY_ACCESS_PERMISSIONS. Sadly Debian refused to enable it.

dmesg prints trap ... ip:... sp:... in ... and I don’t want to see them

Try sysctl -w debug.exception-trace=0.

License

I am providing code in this repository to you under the MIT license (see LICENSE for details).
Because this is my personal repository, the license you receive to my code is from me and not from my previous employer(s) or current employer.