Survey of Config Languages

Ian Wrzesinski edited this page Dec 6, 2024 · 58 revisions

What is a "Config Language"? What Belongs on this Page?

Config languages let you provide parameters to software, usually at runtime. For example:

  • What flags should we pass to an executable?
    • --port 80 to a web server, or --port 8080?
    • -O0 to a compiler for debug builds, or -O2 for release builds?
  • What environment variables? PATH, PYTHONPATH, etc.
  • What kind of container or VM should a program run in?
  • How should a program be scheduled? (both locally and remotely)
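
As a minimal sketch of the first two bullets, here is how a toy program might receive a flag and an environment variable at runtime. The function name `load_settings`, the variable name `APP_ENV`, and the default values are all assumptions made up for illustration.

```python
import argparse


def load_settings(argv, environ):
    """Read one flag and one environment variable (both hypothetical)."""
    parser = argparse.ArgumentParser(description="toy web server")
    parser.add_argument("--port", type=int, default=8080)
    args = parser.parse_args(argv)
    # APP_ENV is an invented variable name, used only in this sketch.
    mode = environ.get("APP_ENV", "debug")
    return {"port": args.port, "mode": mode}


# Pass the arguments and environment explicitly so the example is repeatable.
settings = load_settings(["--port", "80"], {"APP_ENV": "release"})
print(settings)
```

A config language is one way of producing those flags and variables from a file instead of typing them by hand.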

This page is organized from least expressive to most expressive:

  1. Languages for String Data
  2. Languages for Typed Data
  3. Programmable String-ish Languages
  4. Programmable Typed Data
  5. Internal DSLs in General Purpose Languages

This doesn't imply anything about what solution you should use!

Oils is at the most expressive end of that spectrum: see Hay Ain't YAML - Custom Languages for Unix Systems. It's the only config language embedded in a shell! It uses a staged evaluation model.

This page is editable, so feel free to add links below. And feel free to add notable usages, especially under Who Uses it?

Languages For String Data

  • The informal INI file format expresses key-value pairs in sections.
  • XML started as a document language
    • But it's often used in IDE "project files", like Eclipse and Visual Studio.
    • It has a model of tags and attributes, but no arrays, integers, booleans, etc.
      • But if an XML schema is specified, it may provide such features.
  • YAML is (surprisingly) the de facto control plane language of the cloud!
    • Since it also supports object serialization, unsafe deserialization is possible.
    • Although it has JSON-like types, it's arguably string-ish because it has type confusion, e.g. the boolean no vs the string "no".
    • Almost all continuous integration / build services use it: sourcehut, GitLab CI, GitHub Actions, CircleCI, Travis CI, etc.
    • Google App Engine uses it for app.yaml
    • Kubernetes uses it, often with Go Templates (e.g. Helm)
    • One of the original creators wrote a blog post in 2020 with some of YAML's early history.
  • NestedText is visually similar to YAML, but "it only supports one scalar type: strings."
    • "As such, quoting strings is unnecessary, and without quoting there is no need for escaping."
    • It's one of a few formats used by Parametrize From File.
  • Some languages support a limited subset of shell variable syntax: assignments can refer to previously defined variables, sometimes with a few extensions.
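
To make "string data" concrete, here is a small sketch using Python's standard `configparser` module, which is one implementation of the informal INI format. The section and key names are invented for the example. Every value comes back as a string, and it is the application, not the file format, that decides whether `no` means a boolean.

```python
import configparser

# A hypothetical INI file; the section and keys are made up for this sketch.
INI_TEXT = """
[server]
port = 8080
debug = no
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Plain lookups return strings, whatever the value looks like.
raw_port = parser["server"]["port"]
raw_debug = parser["server"]["debug"]

# Typed values only appear when the program asks for a conversion.
port = parser["server"].getint("port")
debug = parser["server"].getboolean("debug")

print(repr(raw_port), repr(raw_debug), port, debug)
```

Compare this with YAML's type confusion mentioned above: there the parser guesses a type, whereas here the caller chooses one explicitly.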

Languages For Typed Data

These languages allow you to express structured and dynamically typed data, but they aren't programmable.

  • JSON is derived from JavaScript, but it's now available in almost every language. It's used by Node.js in package.json.
    • JavaScript Object Notation is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate.
    • JSON5 is an extension of JSON that allows trailing commas and comments, among other things. Used by Chromium, Next.js, WebStorm, etc.
    • JSONC is a more minimal extension that only adds comments. Used by VS Code and Deno.
  • TOML looks like an .INI file, but it has JSON-like types. It's used by Rust in Cargo.toml.
    • A config file format for humans. TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics.
  • p-list is a family of formats used to store serialized objects on NeXTSTEP-derived systems. In particular, Apple tools widely use it for configuration.
  • sdlang -- appears to be used by Oracle, Bank of America, etc.?
    • SDLang is a simple and concise way to textually represent data. It has an XML-like structure – tags, values and attributes – which makes it a versatile choice for data serialization, configuration files, or declarative languages. Its syntax was inspired by the C family of languages
  • kdl -- a close relative of SDLang with a specification and a nice looking website.
  • ron -- simple readable data serialization format that looks similar to Rust syntax
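
As a rough illustration of "typed data", the sketch below parses a made-up JSON config with Python's standard `json` module. The keys and values are assumptions for the example. The parser itself yields integers, booleans, and lists, with no conversion step in the application. The second half shows why extensions such as JSONC and JSON5 exist: a strict JSON parser rejects comments.

```python
import json

# A hypothetical config document; the keys are invented for this sketch.
JSON_TEXT = '{"port": 8080, "debug": false, "hosts": ["a.example", "b.example"]}'

config = json.loads(JSON_TEXT)

# Record the Python type that the parser chose for each value.
types = {key: type(value).__name__ for key, value in config.items()}
print(types)

# Strict JSON has no comment syntax, so this input is rejected.
WITH_COMMENT = '{"port": 8080 /* default */}'
try:
    json.loads(WITH_COMMENT)
    comment_accepted = True
except json.JSONDecodeError:
    comment_accepted = False

print(comment_accepted)
```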

Programmable String-ish Languages

  • Go templates are for arbitrary text and HTML.
    • They are often used to generate variants of a YAML file.
  • M4 is used by autotools to generate shell and Make.
  • CMake generates Make and Ninja files.
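
The risk with string-level generation is that the template engine does not know the format it is producing. The sketch below uses Python's `string.Template` as a stand-in for Go templates or M4, and JSON as a stand-in for YAML (only because the standard library has a JSON parser). The template text and the `render` helper are assumptions for illustration.

```python
import json
from string import Template

# A text template that is meant to produce JSON, but knows nothing about JSON.
TEMPLATE = Template('{"service": "$name", "port": $port}')


def render(name, port):
    """Substitute values into the template as plain text."""
    return TEMPLATE.substitute(name=name, port=port)


# A simple value happens to produce valid output.
ok_text = render("web", 8080)
ok_config = json.loads(ok_text)

# A value containing a quote is pasted in unescaped and corrupts the output.
broken_text = render('web "v2"', 8080)
try:
    json.loads(broken_text)
    broken_is_valid = True
except json.JSONDecodeError:
    broken_is_valid = False

print(ok_config, broken_is_valid)
```

The failure only shows up for some inputs, which is what makes this class of bug hard to spot in generated config.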

Programmable Typed Data

  • CUE
    • Validate, define, and use dynamic and text-based data
    • The Logic of CUE -- Types are Values; The Value Lattice
    • CUE History -- Although it is a very different language, the roots of CUE lie in GCL, the dominant configuration language in use at Google. It was originally designed to configure Borg, the predecessor of Kubernetes.
    • Who uses it?
  • Dhall
  • HCL (HashiCorp Configuration Language) is used by Terraform, e.g. to provision AWS infrastructure. Similar to UCL.
    • HCL is a toolkit for creating structured configuration languages that are both human- and machine-friendly, for use with command-line tools. Although intended to be generally useful, it is primarily targeted towards devops tools, servers, etc.
  • Jsonnet - The Data Templating Language.
    • Generate config data; Side-effect free; Organize, simplify, unify; Manage sprawling config
  • Nickel -- influenced by Nix, but it's independent of an application.
    • Write complex configurations. Modular, correct and boilerplate-free
    • Merge; Verify & Validate; Reuse
  • Nix Expression Language is used to write Nix package definitions.
  • Pkl – "An embeddable configuration language which provides rich support for data templating and validation. It can be used from the command line, integrated in a build pipeline, or embedded in a program."
  • Starlark
    • Starlark is a dialect of Python intended for use as a configuration language. It was designed for the Bazel build system.
  • UCG (Universal Configuration Grammar) is designed to solve the "templating" problem for JSON, YAML, etc.
    • Templates can be difficult to manage without introducing hard to see errors in the serialization format they are generating. Most templating engines aren't aware of the format they are templating
    • Who uses it?
  • UCL (Universal Configuration Language)
    • UCL is heavily infused by nginx configuration as the example of a convenient configuration system. However, UCL is fully compatible with JSON format and is able to parse json files.
    • Who uses it?
  • ytt (YAML Templating Tool)
    • template and patch YAML and text files using Starlark for flow control
    • modular, hermetic, deterministic, side-effect free
    • guaranteed valid YAML output (handles value escaping and indentation); aware of YAML structure
    • Who uses it?
      • originally designed by VMware for Carvel — packaging, distribution, and deployment for Kubernetes
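
Several of the languages above work on structured values rather than text: they combine or override data and only serialize at the end. The following is a minimal Python sketch of that idea, not the semantics of any language listed here (CUE, Jsonnet, and Nickel each define merging differently). The `merge` helper and the config keys are assumptions for illustration.

```python
import json


def merge(base, override):
    """Return a new dict where override wins and nested dicts merge recursively."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


# Hypothetical base config and a release-specific overlay.
BASE = {"service": {"name": 'web "v2"', "port": 8080}, "replicas": 1}
RELEASE = {"service": {"port": 80}, "replicas": 3}

release_config = merge(BASE, RELEASE)

# The serializer handles quoting, so the output parses back to the same data.
text = json.dumps(release_config, sort_keys=True)
round_trip = json.loads(text)

print(text)
```

Because the quote inside the service name is escaped by the serializer rather than by the author, the output stays well-formed, which is the property ytt describes as being aware of the output structure.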

Proprietary / non-open-source: Google's BCL

Internal DSLs in General Purpose Languages

Links
