Survey of Config Languages
Ian Wrzesinski edited this page Dec 6, 2024 · 58 revisions
Config languages let you provide parameters to software, usually at runtime. For example:
- What flags should we pass to an executable?
  - `--port 80` to a web server, or `--port 8080`?
  - `-O0` to a compiler for debug builds, or `-O2` for release builds?
- What environment variables? `PATH`, `PYTHONPATH`, etc.
- What kind of container or VM should a program run in?
- How should a program be scheduled? (both locally and remotely)
This page is organized from least expressive to most expressive:
- Languages for String Data
- Languages for Typed Data
- Programmable String-ish Languages
- Programmable Typed Data
- Internal DSLs in General Purpose Languages
This doesn't imply anything about what solution you should use!
Oils sits at the most expressive end of that spectrum: see Hay Ain't YAML - Custom Languages for Unix Systems. It's the only config language embedded in a shell! It uses a staged evaluation model.
This page is editable, so feel free to add links below. And feel free to add notable usages, especially under Who Uses it?
Languages for String Data
- The informal INI file format expresses key-value pairs in sections.
  - It's used by git for `.git/config`, and I believe Mercurial.
  - The Desktop Entry Specification is a widely used formal extension; particularly note locale strings.
- XML started as a document language.
  - But it's often used in IDE "project files", like Eclipse and Visual Studio.
  - It has a model of tags and attributes, but no arrays, integers, booleans, etc.
  - But if an XML schema is specified, it may provide such features.
- YAML is (surprisingly) the de facto control plane language of the cloud!
  - Since it also supports object serialization, unsafe deserialization is possible.
  - Although it has JSON-like types, it's arguably string-ish because it has type confusion, e.g. the boolean `no` vs the string `"no"`.
  - Almost all continuous integration / build services use it: sourcehut, GitLab CI, GitHub Actions, CircleCI, Travis CI, etc.
  - Google App Engine uses it for `app.yaml`.
  - Kubernetes uses it, often with Go Templates (e.g. Helm).
  - One of the original creators wrote a blog post in 2020 with some of YAML's early history.
- NestedText is visually similar to YAML, but "it only supports one scalar type: strings."
  - "As such, quoting strings is unnecessary, and without quoting there is no need for escaping."
  - It's one of a few formats used by Parametrize From File.
- Some languages support a limited subset of shell variable syntax. Assignments can refer to previously-defined variables simply, or perhaps with limited extensions.
  - environment.d files support `:+` and `:-`
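The `:-` and `:+` forms mentioned above for environment.d files can be sketched in a few lines of Python. This is a simplified illustration, not systemd's parser: the names `expand` and `parse_env_lines` and the sample variables are made up for this example, quoting and escaping are ignored, and only variables assigned earlier in the same text are visible.

```python
import re

# Matches ${NAME}, ${NAME:-default}, and ${NAME:+alternate}.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([-+])([^}]*))?\}")


def expand(value, env):
    """Expand variable references in `value` using the assignments in `env`."""
    def repl(match):
        name, op, arg = match.group(1), match.group(2), match.group(3)
        current = env.get(name, "")
        if op == "-":
            # ${NAME:-default}: use the default when NAME is unset or empty.
            return current if current else arg
        if op == "+":
            # ${NAME:+alternate}: use the alternate only when NAME is non-empty.
            return arg if current else ""
        return current

    return _VAR.sub(repl, value)


def parse_env_lines(text):
    """Parse KEY=VALUE lines in order; later lines may refer to earlier keys."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, raw = line.partition("=")
        env[key.strip()] = expand(raw.strip(), env)
    return env


# Hypothetical file contents, for illustration only.
EXAMPLE = """\
BASE=/opt/app
BIN=${BASE}/bin
EDITOR_CMD=${EDITOR:-vi}
DEBUG_FLAG=${DEBUG:+--verbose}
"""

config = parse_env_lines(EXAMPLE)
print(config)
```

Here `EDITOR` and `DEBUG` are never assigned, so `EDITOR_CMD` falls back to `vi` and `DEBUG_FLAG` comes out empty. The values are still plain strings, which is why this family sits in the string data group.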
Languages for Typed Data
These languages allow you to express structured and dynamically typed data, but they aren't programmable.
- JSON is derived from JavaScript, but it's now available in almost every language. It's used by Node.js in `package.json`.
  - JavaScript Object Notation is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.
  - JSON5 is an extension of JSON that allows trailing commas and comments, among other things. Used by Chromium, Next.js, WebStorm, etc.
  - JSONC is a more minimal extension which only adds comments. Used by VS Code and Deno.
- TOML looks like an INI file, but it has JSON-like types. It's used by Rust in `Cargo.toml`.
  - A config file format for humans. TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics.
- p-list is a family of formats used to store serialized objects on NeXTSTEP-derived systems. In particular, Apple tools widely use it for configuration.
- sdlang -- appears to be used by Oracle, Bank of America, etc.?
  - SDLang is a simple and concise way to textually represent data. It has an XML-like structure – tags, values and attributes – which makes it a versatile choice for data serialization, configuration files, or declarative languages. Its syntax was inspired by the C family of languages.
- kdl -- a close relative of SDLang with a specification and a nice looking website.
- ron -- simple readable data serialization format that looks similar to Rust syntax
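To make the difference between string data and typed data concrete, here is a small sketch using only the Python standard library. The file contents and the helper names `load_ini` and `load_json` are invented for illustration; the point is just that the same setting arrives as a string from an INI file and as a number or boolean from JSON.

```python
import configparser
import json

# Hypothetical config contents expressing the same settings in two formats.
INI_TEXT = """\
[server]
port = 80
debug = no
"""

JSON_TEXT = '{"server": {"port": 80, "debug": false}}'


def load_ini(text):
    """Parse INI text into nested dicts; every value stays a str."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def load_json(text):
    """Parse JSON text; numbers and booleans keep their types."""
    return json.loads(text)


ini_cfg = load_ini(INI_TEXT)
json_cfg = load_json(JSON_TEXT)

print(type(ini_cfg["server"]["port"]).__name__)    # str
print(type(json_cfg["server"]["port"]).__name__)   # int
print(type(json_cfg["server"]["debug"]).__name__)  # bool
```

With INI, the consuming program has to decide that `port` is a number, for example by calling `ConfigParser.getint`. With JSON, the type comes from the file itself.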
Programmable String-ish Languages
- Go templates are for arbitrary text and HTML.
  - They are often used to generate variants of a YAML file.
- M4 is used by autotools to generate shell and Make.
- CMake generates Make and Ninja files.
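One reason these tools count as "string-ish" is that the template engine doesn't know the syntax of what it generates. The sketch below is only an illustration of that idea: it uses Python's `string.Template` rather than Go templates or M4, it targets JSON instead of YAML so that the standard library can check the result, and the function names are hypothetical.

```python
import json
from string import Template

# A text template: the engine sees characters, not a JSON structure.
TEMPLATE = Template('{"service": "$name", "port": $port}')


def render_text(name, port):
    """Generate config by pasting values into text."""
    return TEMPLATE.substitute(name=name, port=port)


def render_data(name, port):
    """Generate config by building data first, then serializing it."""
    return json.dumps({"service": name, "port": port})


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


tricky = 'web "blue"'  # a value containing quote characters

print(is_valid_json(render_text("web", 80)))   # simple values work
print(is_valid_json(render_text(tricky, 80)))  # quotes break the output
print(json.loads(render_data(tricky, 80))["service"] == tricky)
```

The text template produces broken output as soon as a value contains a quote, while the data-first version escapes it correctly. The same class of problem shows up with indentation and special characters when templating YAML.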
Programmable Typed Data
- CUE
  - Validate, define, and use dynamic and text-based data
  - The Logic of CUE -- Types are Values; The Value Lattice
  - CUE History -- Although it is a very different language, the roots of CUE lie in GCL, the dominant configuration language in use at Google. It was originally designed to configure Borg, the predecessor of Kubernetes.
  - Who uses it?
- Dhall
  - Dhall is a programmable configuration language that you can think of as: JSON + functions + types + imports
  - Note: Dhall isn't Turing-complete, but this restriction has no useful engineering properties.
  - Unconstrained side effects are the thing that matters for configuration, not computation.
  - Who uses it?
- HCL (HashiCorp Configuration Language) is used by Terraform (for example, to configure AWS infrastructure). Similar to UCL.
  - *HCL is a toolkit for creating structured configuration languages that are both human- and machine-friendly, for use with command-line tools. Although intended to be generally useful, it is primarily targeted towards devops tools, servers, etc.*
- Jsonnet - The Data Templating Language.
  - Generate config data; Side-effect free; Organize, simplify, unify; Manage sprawling config
- Nickel -- influenced by Nix, but not tied to a particular application.
  - Write complex configurations. Modular, correct and boilerplate-free
  - Merge; Verify & Validate; Reuse
- Nix Expression Language is used to write Nix package definitions.
- Pkl – "An embeddable configuration language which provides rich support for data templating and validation. It can be used from the command line, integrated in a build pipeline, or embedded in a program."
- Starlark
  - Starlark is a dialect of Python intended for use as a configuration language. It was designed for the Bazel build system.
- UCG (Universal Configuration Grammar) is designed to solve the "templating" problem for JSON, YAML, etc.
  - Templates can be difficult to manage without introducing hard to see errors in the serialization format they are generating. Most templating engines aren't aware of the format they are templating
  - Who uses it?
- UCL (Universal Configuration Language)
  - UCL is heavily infused by nginx configuration as the example of a convenient configuration system. However, UCL is fully compatible with JSON format and is able to parse json files.
  - Who uses it?
- ytt (YAML Templating Tool)
  - template and patch YAML and text files using Starlark for flow control
  - modular, hermetic, deterministic, side-effect free
  - guaranteed valid YAML output (handles value escaping and indentation); aware of YAML structure
  - Who uses it?
    - originally designed by VMware for Carvel (packaging, distribution, and deployment for Kubernetes)
Proprietary / non-open-source: Google's BCL
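Since Starlark is described above as a dialect of Python, plain Python can give a rough feel for what "programmable typed data" means: functions compute config values, checks run before anything is emitted, and the output is ordinary typed data. This is not Starlark, Jsonnet, CUE, or any other language in this list; the names `service` and `environment` and the validation rules are assumptions made up for the sketch.

```python
import json


def service(name, port, replicas=1):
    """Return one service definition, validating it before it is emitted."""
    if not isinstance(name, str) or not name:
        raise ValueError("name must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(port, int) or isinstance(port, bool) or not 0 < port < 65536:
        raise ValueError("port must be an integer in 1..65535")
    if not isinstance(replicas, int) or isinstance(replicas, bool) or replicas < 1:
        raise ValueError("replicas must be a positive integer")
    return {"name": name, "port": port, "replicas": replicas}


def environment(env):
    """Compute the config for one environment instead of copy-pasting it."""
    replicas = 3 if env == "prod" else 1
    return {
        "env": env,
        "services": [
            service("web", 8080, replicas),
            service("api", 9090, replicas),
        ],
    }


configs = {env: environment(env) for env in ("dev", "prod")}
print(json.dumps(configs["prod"], indent=2, sort_keys=True))
```

A mistake such as `service("web", "80")` fails while the config is being generated, rather than later inside the program that consumes it.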
Internal DSLs in General Purpose Languages
- Hay Ain't YAML - Custom Languages for Unix Systems. The only config language that's embedded in a shell! It uses a staged evaluation model.
  - Oil adds the missing declarative part to shell.
  - We need a better control plane language for the cloud.
  - Distributed systems are just code and data / processes and files. We avoid introducing concepts that don't compose.
- G-Expressions in Guile Scheme are used by the Guix distro
- Ruby blocks in Vagrant, Rake, etc.
- Tcl commands can also be used to define data.
  - Data Definition and Code Generation in Tcl (2003, PDF)
- Related surveys
- Notes on Oil: Config Dialect / Hay
- lobste.rs thread on Hay, with more comparisons
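As a rough illustration of the internal DSL approach listed above (Ruby blocks, Tcl commands, Hay), here is a Python sketch in which the config file is ordinary code that calls a small vocabulary of functions. It also shows one way to read "staged evaluation": first run the code to obtain plain data, then interpret that data separately. This is not how Hay or any of the tools above are implemented; `package`, `evaluate_config`, and `build_order` are hypothetical names.

```python
def evaluate_config(source):
    """Stage 1: run config code and collect its declarations as plain data."""
    declared = []

    def package(name, version, deps=()):
        declared.append({"name": name, "version": version, "deps": list(deps)})

    # Expose only the DSL vocabulary. Removing builtins keeps the sketch small;
    # it is NOT a security sandbox.
    namespace = {"__builtins__": {}, "package": package}
    exec(source, namespace)
    return declared


def build_order(declared):
    """Stage 2: interpret the data, here as a dependency-first ordering."""
    by_name = {d["name"]: d for d in declared}
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in by_name[name]["deps"]:
            visit(dep)
        order.append(name)

    for d in declared:
        visit(d["name"])
    return order


# Hypothetical config "file": variables and loops come from the host language.
CONFIG = """
common = ["libc"]
package("app", "0.1", deps=common + ["zlib", "openssl"])
package("libc", "2.39")
for name in ["zlib", "openssl"]:
    package(name, "1.0", deps=common)
"""

declared = evaluate_config(CONFIG)
print(build_order(declared))
```

The config author gets variables and loops for free from the host language, and the second stage only ever sees data, so it can be tested or swapped out independently.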