diff --git a/config-linux.md b/config-linux.md
index c543c4b9d..7d683994b 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -15,49 +15,3 @@ Valid values are the strings for capabilities defined in [the man page](http://m
         "CAP_NET_BIND_SERVICE"
     ]
 ```
-
-## User namespace mappings
-
-```json
-    "uidMappings": [
-        {
-            "hostID": 1000,
-            "containerID": 0,
-            "size": 10
-        }
-    ],
-    "gidMappings": [
-        {
-            "hostID": 1000,
-            "containerID": 0,
-            "size": 10
-        }
-    ]
-```
-
-uid/gid mappings describe the user namespace mappings from the host to the container.
-The mappings represent how the bundle `rootfs` expects the user namespace to be setup and the runtime SHOULD NOT modify the permissions on the rootfs to realize the mapping.
-*hostID* is the starting uid/gid on the host to be mapped to *containerID* which is the starting uid/gid in the container and *size* refers to the number of ids to be mapped.
-There is a limit of 5 mappings which is the Linux kernel hard limit.
-
-## Default Devices and File Systems
-
-The Linux ABI includes both syscalls and several special file paths.
-Applications expecting a Linux environment will very likely expect these files paths to be setup correctly.
-
-The following devices and filesystems MUST be made available in each application's filesystem
-
-| Path         | Type   | Notes   |
-| ------------ | ------ | ------- |
-| /proc        | [procfs](https://www.kernel.org/doc/Documentation/filesystems/proc.txt) | |
-| /sys         | [sysfs](https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt) | |
-| /dev/null    | [device](http://man7.org/linux/man-pages/man4/null.4.html) | |
-| /dev/zero    | [device](http://man7.org/linux/man-pages/man4/zero.4.html) | |
-| /dev/full    | [device](http://man7.org/linux/man-pages/man4/full.4.html) | |
-| /dev/random  | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
-| /dev/urandom | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
-| /dev/tty     | [device](http://man7.org/linux/man-pages/man4/tty.4.html) | |
-| /dev/console | [device](http://man7.org/linux/man-pages/man4/console.4.html) | |
-| /dev/pts     | [devpts](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) | |
-| /dev/ptmx    | [device](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) | Bind-mount or symlink of /dev/pts/ptmx |
-| /dev/shm     | [tmpfs](https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt) | |
diff --git a/runtime-config-linux.md b/runtime-config-linux.md
index f6bf82ee3..d7ced0801 100644
--- a/runtime-config-linux.md
+++ b/runtime-config-linux.md
@@ -9,6 +9,7 @@ Each entry has a type field with possible values described below and an optional
 If a path is specified, that particular file is used to join that type of namespace.
 Also, when a path is specified, a runtime MUST assume that the setup for that particular namespace has already been done and error out if the config specifies anything else related to that namespace.
 
+*Example*
 ```json
     "namespaces": [
         {
@@ -45,6 +46,31 @@ container via system level IPC.
 * **user** the container will be able to remap user and group IDs from the host to local users and groups
 within the container.
 
+## User namespace mappings
+
+uid/gid mappings describe the user namespace mappings from the host to the container.
+The mappings represent how the bundle `rootfs` expects the user namespace to be setup and the runtime SHOULD NOT modify the permissions on the rootfs to realize the mapping.
+*hostID* is the starting uid/gid on the host to be mapped to *containerID* which is the starting uid/gid in the container and *size* refers to the number of ids to be mapped.
+There is a limit of 5 mappings which is the Linux kernel hard limit.
+
+*Example*
+```json
+    "uidMappings": [
+        {
+            "hostID": 1000,
+            "containerID": 0,
+            "size": 10
+        }
+    ],
+    "gidMappings": [
+        {
+            "hostID": 1000,
+            "containerID": 0,
+            "size": 10
+        }
+    ]
+```
+
 ## Devices
 
 Devices is an array specifying the list of devices to be created in the container.
@@ -61,6 +87,20 @@ Next parameters can be specified:
 * uid - uid of device owner
 * gid - gid of device owner
 
+Note: The following devices MUST be made available in each Linux application's filesystem
+
+| Path         | Type   | Notes   |
+| ------------ | ------ | ------- |
+| /dev/null    | [device](http://man7.org/linux/man-pages/man4/null.4.html) | |
+| /dev/zero    | [device](http://man7.org/linux/man-pages/man4/zero.4.html) | |
+| /dev/full    | [device](http://man7.org/linux/man-pages/man4/full.4.html) | |
+| /dev/random  | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
+| /dev/urandom | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
+| /dev/tty     | [device](http://man7.org/linux/man-pages/man4/tty.4.html) | |
+| /dev/console | [device](http://man7.org/linux/man-pages/man4/console.4.html) | |
+| /dev/ptmx    | [device](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) | Bind-mount or symlink of /dev/pts/ptmx |
+
+*Example*
 ```json
     "devices": [
         {
@@ -126,6 +166,45 @@ Next parameters can be specified:
     ]
 ```
 
+## Mounts
+
+See the [description](runtime-config.md#mount_configuration) of Mounts.
+
+Note: The following filesystems MUST be made available in each Linux application's filesystem
+
+| Path         | Type   |
+| ------------ | ------ |
+| /proc        | [procfs](https://www.kernel.org/doc/Documentation/filesystems/proc.txt) |
+| /sys         | [sysfs](https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt) |
+| /dev/pts     | [devpts](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) |
+| /dev/shm     | [tmpfs](https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt) |
+
+*Example*
+```json
+"mounts": {
+    "proc": {
+        "type": "proc",
+        "source": "proc",
+        "options": []
+    },
+    "dev": {
+        "type": "tmpfs",
+        "source": "tmpfs",
+        "options": ["nosuid","strictatime","mode=755","size=65536k"]
+    },
+    "devpts": {
+        "type": "devpts",
+        "source": "devpts",
+        "options": ["nosuid","noexec","newinstance","ptmxmode=0666","mode=0620","gid=5"]
+    },
+    "data": {
+        "type": "bind",
+        "source": "/volumes/testing",
+        "options": ["rbind","rw"]
+    }
+}
+```
+
 ## Control groups
 
 Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
@@ -140,6 +219,7 @@ The Spec does not include naming schema for cgroups.
 The Spec does not support [split hierarchy](https://www.kernel.org/doc/Documentation/cgroups/unified-hierarchy.txt).
 The cgroups will be created if they don't exist.
 
+*Example*
 ```json
     "cgroupsPath": "/myRuntime/myContainer"
 ```
@@ -148,6 +228,7 @@ The cgroups will be created if they don't exist.
 
 Optionally, cgroups limits can be specified via `resources`.
 
+*Example*
 ```json
     "resources": {
         "disableOOMKiller": false,
@@ -191,6 +272,7 @@ For example, to run a new process in an existing container without updating limi
 sysctl allows kernel parameters to be modified at runtime for the container.
 For more information, see [the man page](http://man7.org/linux/man-pages/man8/sysctl.8.html)
 
+*Example*
 ```json
     "sysctl": {
         "net.ipv4.ip_forward": "1",
@@ -200,6 +282,7 @@ For more information, see [the man page](http://man7.org/linux/man-pages/man8/sy
 
 ## Rlimits
 
+*Example*
 ```json
     "rlimits": [
         {
@@ -218,6 +301,8 @@ The kernel enforces the `soft` limit for a resource while the `hard` limit acts
 
 SELinux process label specifies the label with which the processes in a container are run.
 For more information about SELinux, see [Selinux documentation](http://selinuxproject.org/page/Main_Page)
+
+*Example*
 ```json
     "selinuxProcessLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675"
 ```
@@ -227,6 +312,7 @@ For more information about SELinux, see [Selinux documentation](http://selinuxp
 Apparmor profile specifies the name of the apparmor profile that will be used for the container.
 For more information about Apparmor, see [Apparmor documentation](https://wiki.ubuntu.com/AppArmor)
 
+*Example*
 ```json
     "apparmorProfile": "acme_secure_profile"
 ```
@@ -238,6 +324,7 @@ Seccomp configuration allows one to configure actions to take for matched syscal
 For more information about Seccomp, see [Seccomp kernel documentation](https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt)
 The actions and operators are strings that match the definitions in seccomp.h from [libseccomp](https://github.com/seccomp/libseccomp) and are translated to corresponding values.
 
+*Example*
 ```json
     "seccomp": {
         "defaultAction": "SCMP_ACT_ALLOW",
@@ -256,6 +343,7 @@ rootfsPropagation sets the rootfs's mount propagation.
 Its value is either slave, private, or shared.
 [The kernel doc](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt) has more information about mount propagation.
 
+*Example*
 ```json
     "rootfsPropagation": "slave",
 ```
diff --git a/runtime-config.md b/runtime-config.md
index de3e82a13..9c489dea1 100644
--- a/runtime-config.md
+++ b/runtime-config.md
@@ -12,30 +12,7 @@ Only [mounts from the portable config](config.md#mount-points) will be mounted.
 
 *Example (Linux)*
 
-```json
-"mounts": {
-    "proc": {
-        "type": "proc",
-        "source": "proc",
-        "options": []
-    },
-    "dev": {
-        "type": "tmpfs",
-        "source": "tmpfs",
-        "options": ["nosuid","strictatime","mode=755","size=65536k"]
-    },
-    "devpts": {
-        "type": "devpts",
-        "source": "devpts",
-        "options": ["nosuid","noexec","newinstance","ptmxmode=0666","mode=0620","gid=5"]
-    },
-    "data": {
-        "type": "bind",
-        "source": "/volumes/testing",
-        "options": ["rbind","rw"]
-    }
-}
-```
+See Mounts [example](runtime-config-linux.md#mounts-in-linux) in Linux
 
 *Example (Windows)*
 
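
Reading aid for the relocated "User namespace mappings" text: the sketch below shows how a `hostID` / `containerID` / `size` entry could be interpreted when translating a container uid or gid to its host counterpart. The function names `validate_mappings` and `to_host_id` are hypothetical and are not part of the spec or of any runtime. The requirement that `size` be positive is an assumption; the limit of 5 entries is taken from the spec text in this diff.

```python
MAX_MAPPINGS = 5  # limit stated in the spec text moved by this diff


def validate_mappings(mappings):
    """Raise ValueError if the list breaks the rules assumed in this sketch."""
    if len(mappings) > MAX_MAPPINGS:
        raise ValueError(
            f"at most {MAX_MAPPINGS} mappings allowed, got {len(mappings)}"
        )
    for m in mappings:
        # Assumption: a mapping must cover at least one id.
        if m["size"] <= 0:
            raise ValueError(f"size must be positive: {m}")


def to_host_id(container_id, mappings):
    """Translate a container uid/gid to the host id, or None if unmapped."""
    for m in mappings:
        offset = container_id - m["containerID"]
        # The entry covers containerID .. containerID + size - 1.
        if 0 <= offset < m["size"]:
            return m["hostID"] + offset
    return None


# Same values as the "uidMappings" example in the diff.
uid_mappings = [{"hostID": 1000, "containerID": 0, "size": 10}]
validate_mappings(uid_mappings)

print(to_host_id(0, uid_mappings))   # 1000
print(to_host_id(9, uid_mappings))   # 1009
print(to_host_id(10, uid_mappings))  # None (outside the mapped range)
```

With the example values, container ids 0 through 9 correspond to host ids 1000 through 1009, and any other container id has no host counterpart.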