The Transformations Extension

flyx edited this page May 6, 2017 · 9 revisions

A frequently requested feature in YAML is the ability to transform the input data via YAML syntax, for example string interpolation in scalars, concatenation of lists, or replacing certain keys of a given mapping. In the past, YAML provided the so-called merge tag, which was poorly specified and violated YAML's core concepts. It raises questions such as: is a merge key valid in a mapping whose explicit tag allows only certain keys, not including the merge key?

The specification of the merge tag has not been updated for YAML 1.2, but some implementations support it and it is requested for others. This extension uses annotations to define a viable alternative whose behavior is well-defined for all use cases, which plays well with YAML's other features, and which covers a wider range of use cases.

Definitions

  • All annotations defined within this document are called transformation actions.

  • A transformation action shall be executed upon its node. Every action takes in one YAML node and produces one YAML node. Actions are allowed to change the type of a YAML node; for example, if a sequence node has a transformation action, the processed result may be a mapping node.

  • All transformation actions defined in this document must be implemented by every conforming implementation. An implementation may provide additional transformation actions defined elsewhere, and may also allow the user to add transformation actions via its API.

  • For YAML documents to be less verbose, some transformation actions defined in this document define an additional one-character shortcut. An implementation may choose not to support those one-character shortcuts, but must state so clearly in its documentation.

  • In this document, whenever the properties, content or structure of a node are discussed and that node is an alias, the node referenced by the alias is the one meant. An implementation of this extension must therefore implement alias resolution.

  • Some actions defined here need to check YAML nodes for equivalence (using the terms equal or same). This equivalence shall be defined as follows: Two YAML nodes are equivalent if

    • they have the same type (scalar, mapping or sequence) and
    • they are structurally equivalent, i.e. for scalars, the contents must match; for sequences, the length must match and all elements with the same index must match; and for mappings, each key must have an equal key in the other mapping and each value must be equal to the value of the same key in the other mapping.

    This is a recursive definition. Note that tags and anchors are not taken into account, as they are resolved at a later stage of loading; for example, the two scalar nodes !!str 42 and !!int 42 are considered equal. This is necessary because otherwise we could not tell whether 42 and !!int 42 are equal: implicit tag resolution may resolve the first scalar's implicit tag to !!int, but only at a later stage.
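
The equivalence rule above can be sketched in Python. This is a minimal illustration, not part of the extension: the Node class, its kind, value and tag fields, and the equivalent function are hypothetical names. Mappings are modeled as lists of key-value pairs so that keys may be arbitrary nodes, and the length check on mappings assumes that keys within one mapping are unique.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node model: kind is "scalar", "sequence" or "mapping"."""
    kind: str
    value: object              # str, list of Node, or list of (Node, Node) pairs
    tag: Optional[str] = None  # carried along but ignored by equivalent()


def equivalent(a: Node, b: Node) -> bool:
    """Structural equivalence that ignores tags (and anchors, which are not modeled)."""
    if a.kind != b.kind:
        return False
    if a.kind == "scalar":
        return a.value == b.value
    if a.kind == "sequence":
        return len(a.value) == len(b.value) and all(
            equivalent(x, y) for x, y in zip(a.value, b.value)
        )
    # Mapping: every pair in a needs a pair in b with an equivalent key and value.
    # Assumption: keys are unique per mapping, so equal length makes this symmetric.
    if len(a.value) != len(b.value):
        return False
    return all(
        any(equivalent(ka, kb) and equivalent(va, vb) for kb, vb in b.value)
        for ka, va in a.value
    )
```

With this model, Node("scalar", "42", "!!str") and Node("scalar", "42", "!!int") compare as equivalent, matching the !!str 42 and !!int 42 case described above.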

Transformation Actions

@concat

@concat must only be used on a YAML sequence and has the shortcut @c. All items of that YAML sequence must share the same node type; their tags are ignored. @concat produces a node of the same type as the sequence's items.

  • If the sequence contains scalars, the result is a scalar whose content is the string concatenation of its items.
  • If the sequence contains sequences, the result is a single sequence holding the items of all input sequences, in order.
  • If the sequence contains mappings, the result is a merged mapping which holds all key-value pairs of all input mappings. If the keys of the input mappings are not disjoint, this results in an error.

Examples:

# input:
--- @concat
[ Hello, ", ", World! ]
# result:
---
Hello, World!

# input:
--- !!intlist @c
- [1, 2, 3]
- [4, 5, 6]
# result:
--- !!intlist
[1, 2, 3, 4, 5, 6]

# input:
---
base: &base
  one: two
  three: four
child: @concat
- *base
- five: six
# result:
---
base: &base
  one: two
  three: four
child:
  one: two
  three: four
  five: six
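
The three cases of @concat can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: plain Python values stand in for YAML nodes (str for scalars, list for sequences, dict for mappings, which restricts keys to hashable values), tags are not modeled, and the function name concat is hypothetical. The text above does not say what an empty sequence produces, so this sketch rejects it.

```python
def concat(items: list):
    """Sketch of @concat over plain Python values standing in for YAML nodes."""
    if not items:
        # Assumption: the empty case is unspecified in the text, so reject it.
        raise ValueError("@concat needs at least one item")
    kind = type(items[0])
    if kind not in (str, list, dict) or any(type(item) is not kind for item in items):
        raise TypeError("all items must share the same node type")
    if kind is str:
        return "".join(items)                         # scalars: string concatenation
    if kind is list:
        return [x for seq in items for x in seq]      # sequences: one flat sequence
    merged = {}
    for mapping in items:                             # mappings: keys must be disjoint
        for key, value in mapping.items():
            if key in merged:
                raise ValueError(f"duplicate key: {key!r}")
            merged[key] = value
    return merged
```

Called on the three example inputs above, concat(["Hello", ", ", "World!"]) gives the single string, the two integer lists give one six-item list, and the base mapping plus {"five": "six"} give the three-pair mapping.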

@interpolate

@interpolate must be used only on scalars and has the shortcut @i. It will replace certain parts within the scalar contents. Parts to be replaced must start with a $. There are three possible formats:

  • /\$\$/ will always be replaced by a $ (which is then part of the content and will not be further processed).
  • /\$([A-Za-z_]+)/ and /\$\{([A-Za-z_]+)\}/ will be resolved as an alias to the anchor named by the first capturing group. The anchor must be on a scalar node. The match shall be replaced by the contents of that scalar node.

Examples:

# input:
---
- &hello Hello
- &world World
- @i "$hello, ${world}! $$"
# result:
---
- &hello Hello
- &world World
- "Hello, World! $"
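
A possible way to implement the three replacement formats is a single regular expression with one alternative per format. The sketch below is an assumption-laden illustration: the function name interpolate and the anchors dictionary (mapping anchor names to the contents of their scalar nodes) are hypothetical, and a lone $ that matches none of the formats is left untouched because the text above does not define that case.

```python
import re

# One alternative per format: "$$", "$name" and "${name}".
_PLACEHOLDER = re.compile(r"\$(?:(\$)|([A-Za-z_]+)|\{([A-Za-z_]+)\})")


def interpolate(text: str, anchors: dict) -> str:
    """Sketch of @interpolate; anchors maps anchor names to scalar contents."""

    def replace(match: re.Match) -> str:
        if match.group(1):
            return "$"  # "$$" becomes a literal "$" and is not processed again
        name = match.group(2) or match.group(3)
        if name not in anchors:
            raise KeyError(f"no scalar anchor named {name!r}")
        return anchors[name]

    # re.sub scans left to right in a single pass, so replaced text is never rescanned.
    return _PLACEHOLDER.sub(replace, text)
```

For the example above, interpolate("$hello, ${world}! $$", {"hello": "Hello", "world": "World"}) returns "Hello, World! $".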

@merge

@merge must be used only on sequences and has the shortcut @m. The sequence must contain only mappings. The result will be a mapping which contains all key-value pairs of all the input mappings, but if a key is repeated in a later mapping, any pair with the same key from an earlier input mapping is discarded. If duplicate keys should be an error instead, use @concat.

Examples:

# input:
---
base: &base
  one: two
  three: four
actual: @m
- *base
- three: five
  six: seven
  eight: nine
- eight: one
# result:
---
base: &base
  one: two
  three: four
actual:
  one: two
  three: five
  six: seven
  eight: one
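
The "later mapping wins" rule can be sketched in Python with dictionaries. This is only an illustration: the function name merge is hypothetical, dict stands in for a YAML mapping (so keys must be hashable), and the key order of the result is an artifact of Python dictionaries rather than something the text above prescribes.

```python
def merge(mappings: list) -> dict:
    """Sketch of @merge: pairs from later mappings replace earlier pairs with the same key."""
    result = {}
    for mapping in mappings:
        if not isinstance(mapping, dict):
            raise TypeError("@merge expects a sequence of mappings")
        result.update(mapping)  # later values overwrite earlier ones
    return result
```

Applied to the example above, merging the base mapping, the second mapping and {"eight": "one"} yields one: two, three: five, six: seven and eight: one.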

@get

@get must only be used on sequences having exactly two items. The first item must be a mapping, the second item may be any node. @get searches for a key in the first item that equals the second item and returns its value. If the key isn't found, this shall lead to an error.

Examples:

# input:
--- @get
- foo: bar
  baz: spam
- baz
# result:
---
spam
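
Because the second item may be any node, the lookup relies on node equivalence rather than on string comparison. The following Python sketch illustrates this under simplifying assumptions: the function name get is hypothetical, a mapping is modeled as a list of key-value pairs so that keys may be lists or other non-scalar values, and Python's == on plain values stands in for the structural equivalence defined earlier (tags are not modeled).

```python
def get(items: list):
    """Sketch of @get: items is [mapping, key]; the mapping is a list of (key, value) pairs."""
    if len(items) != 2:
        raise ValueError("@get expects exactly two items")
    pairs, wanted = items
    for key, value in pairs:
        if key == wanted:  # structural comparison of plain Python values
            return value
    raise KeyError(f"key not found: {wanted!r}")
```

For the example above, get([[("foo", "bar"), ("baz", "spam")], "baz"]) returns "spam".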

@for

@for must only be used on sequences having exactly three items. The first item must be a sequence, the second item must be a scalar, and the third item may be any node. The result is a sequence whose length equals the length of the first input item. Each item in the result sequence is a copy of the third item in which every alias named like the second item is replaced by an item from the first input sequence, such that the i-th item of the output sequence uses the i-th item of the input sequence as its replacement.

Examples:

# input:
--- @for
- [ one, two, three ]
- val
- @i "Go fetch me $val beer!"
# result:
---
- "Go fetch me one beer!"
- "Go fetch me two beer!"
- "Go fetch me three beer!"

You can use @for together with @merge for building mappings:

# input:
--- @merge @for
- [ one, two, three ]
- val
- *val : Some value
# result:
---
one: Some value
two: Some value
three: Some value
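
The alias substitution that @for performs can be sketched in Python. This is a minimal illustration with hypothetical names (Alias, substitute, for_each): plain Python values stand in for YAML nodes, an unresolved alias is modeled as an Alias object, and a replacement used as a mapping key must be hashable because dict stands in for a mapping. Nested annotations inside the third item are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alias:
    """Hypothetical stand-in for an unresolved alias such as *val."""
    name: str


def substitute(node, name: str, replacement):
    """Return a copy of node in which every alias called name is replaced."""
    if isinstance(node, Alias):
        return replacement if node.name == name else node
    if isinstance(node, list):
        return [substitute(item, name, replacement) for item in node]
    if isinstance(node, dict):
        return {
            substitute(key, name, replacement): substitute(value, name, replacement)
            for key, value in node.items()
        }
    return node  # scalars are copied unchanged


def for_each(items: list) -> list:
    """Sketch of @for: items is [values, alias_name, template]."""
    if len(items) != 3:
        raise ValueError("@for expects exactly three items")
    values, name, template = items
    return [substitute(template, name, value) for value in values]
```

For the mapping example above, for_each([["one", "two", "three"], "val", {Alias("val"): "Some value"}]) produces three one-pair mappings; merging them as @merge does gives the single mapping shown in the result.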

@for can also be used together with @get to iterate complex values:

# input:
--- @for
- [ {forename: Karl, surname: Koch}, {forename: Peter, surname: Pan} ]
- val
- @c ["Hello, ", @get [*val, forename], " ", @get [*val, surname], "!"]
# result:
---
- Hello, Karl Koch!
- Hello, Peter Pan!