RFC 003
Tina Müller (tinita) edited this page May 1, 2017 · 13 revisions
Tests: ...
The 1.2 spec productions for a YAML anchor allow too many characters in an anchor name. Specifically, it's a bad idea to allow YAML syntax characters in an anchor name.
These are the 1.2 productions:
c-ns-anchor-property ::= “&” ns-anchor-name
ns-anchor-name ::= ns-anchor-char+
ns-anchor-char ::= ns-char - c-flow-indicator
ns-char ::= nb-char - s-white
nb-char ::= c-printable - b-char - c-byte-order-mark
c-printable ::= #x9 | #xA | #xD | [#x20-#x7E] /* 8 bit */
| #x85 | [#xA0-#xD7FF] | [#xE000-#xFFFD] /* 16 bit */
| [#x10000-#x10FFFF] /* 32 bit */
b-char ::= b-line-feed | b-carriage-return
b-line-feed ::= #xA /* LF */
b-carriage-return ::= #xD /* CR */
c-byte-order-mark ::= #xFEFF
s-white ::= s-space | s-tab
s-space ::= #x20 /* SP */
s-tab ::= #x9 /* TAB */
c-flow-indicator ::= “,” | “[” | “]” | “{” | “}”
This reduces to:
c-ns-anchor-property ::= “&” ns-anchor-name
ns-anchor-name ::= ns-anchor-char+
ns-anchor-char ::= ( [#x21-#x7E] - c-flow-indicator ) | #x85 | [#xA0-#xD7FF]
| [#xE000-#xFEFE] | [#xFF00-#xFFFD] | [#x10000-#x10FFFF]
Certainly #x85 and #xA0 do not belong, as they are whitespace. Neither do '!', '#', ':', '*', '&' and many other punctuation characters.
Anchor names don't need to be that expressive and shouldn't look like they mean
something else to YAML.
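The claims above can be checked by transcribing the 1.2 productions into a small predicate. This is only a sketch written from the productions quoted on this page, not code from any YAML parser; the function names `is_c_printable` and `is_ns_anchor_char_12` are made up for illustration.

```python
# Hypothetical transcription of the YAML 1.2 productions quoted on this page.
FLOW_INDICATORS = set(",[]{}")  # c-flow-indicator


def is_c_printable(cp: int) -> bool:
    """c-printable, as a test on a code point."""
    return (
        cp in (0x9, 0xA, 0xD, 0x85)
        or 0x20 <= cp <= 0x7E
        or 0xA0 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_ns_anchor_char_12(ch: str) -> bool:
    """ns-anchor-char: c-printable minus b-char, BOM, s-white, flow indicators."""
    cp = ord(ch)
    if not is_c_printable(cp):
        return False
    if cp in (0xA, 0xD):   # b-char (LF, CR)
        return False
    if cp == 0xFEFF:       # c-byte-order-mark
        return False
    if cp in (0x20, 0x9):  # s-white (SP, TAB)
        return False
    return ch not in FLOW_INDICATORS


# Characters discussed above, plus a flow indicator and a space for contrast.
for ch in ["!", "#", ":", "*", "&", "\x85", "\xa0", ",", " "]:
    allowed = is_ns_anchor_char_12(ch)
    print(f"U+{ord(ch):04X} allowed in a 1.2 anchor name: {allowed}")
```

Under this transcription, NEL (#x85), the no-break space (#xA0) and the listed punctuation characters are all accepted, while only the flow indicators and plain space or tab are rejected.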
Anchor names should effectively consist only of Unicode "word" characters. This could mean (using Perl regex semantics):
\p{Letter} | \p{Number} | '-' | '_' | '/'
This change should be made in YAML 1.3. Most real world usage is [A-Za-z0-9_], so we should tighten this down now.
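As a rough illustration of the proposed class, the sketch below approximates \p{Letter} and \p{Number} with Python's `unicodedata` general categories (those starting with "L" and "N"). The function names are hypothetical, and the result depends on the Unicode version bundled with the Python build, which may differ from the one Perl uses.

```python
import unicodedata

# The three extra characters named in the proposal.
EXTRA = {"-", "_", "/"}


def is_proposed_anchor_char(ch: str) -> bool:
    """Approximate \\p{Letter} | \\p{Number} | '-' | '_' | '/'."""
    # General categories beginning with "L" are letters, "N" are numbers.
    return unicodedata.category(ch)[0] in ("L", "N") or ch in EXTRA


def is_proposed_anchor_name(name: str) -> bool:
    """An anchor name is one or more proposed anchor characters."""
    return len(name) > 0 and all(is_proposed_anchor_char(c) for c in name)


for name in ["anchor_1", "straße-2/x", "a:b", "a*b", "a.1", "a\xa0b"]:
    print(f"{name!r}: {is_proposed_anchor_name(name)}")
```

In this sketch the first two names are accepted and the rest are rejected; note that a name containing a dot, such as a.1, is not accepted by the class as written above.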
- Verify assumptions
- Determine all code points in the Perl char classes above.
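One possible way to approach the second item is to enumerate the code points and collapse them into ranges. The sketch below again uses Python's `unicodedata` general categories as a stand-in for the Perl classes, which is an assumption; the totals depend on the Unicode version of the Python build, so no expected count is given here. The helper names are made up for illustration.

```python
import sys
import unicodedata

EXTRA = {"-", "_", "/"}


def in_proposed_class(cp: int) -> bool:
    """True if the code point is a letter, a number, or one of the extras."""
    ch = chr(cp)
    return unicodedata.category(ch)[0] in ("L", "N") or ch in EXTRA


def proposed_ranges():
    """Collapse all matching code points into inclusive (low, high) ranges."""
    ranges = []
    start = None
    for cp in range(sys.maxunicode + 1):
        if in_proposed_class(cp):
            if start is None:
                start = cp
        elif start is not None:
            ranges.append((start, cp - 1))
            start = None
    if start is not None:
        ranges.append((start, sys.maxunicode))
    return ranges


ranges = proposed_ranges()
total = sum(hi - lo + 1 for lo, hi in ranges)
print(f"Unicode {unicodedata.unidata_version}: "
      f"{total} code points in {len(ranges)} ranges")
for lo, hi in ranges[:5]:
    print(f"  [#x{lo:X}-#x{hi:X}]")
```

The printed ranges use the same [#xNN-#xNN] notation as the productions, so they could be pasted into a candidate production and reviewed by hand.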
@perlpunk would like to add the dot, e.g. &a.1