Fix empty lines & trailing comments #278

Merged · 8 commits · Jun 5, 2021
53 changes: 20 additions & 33 deletions docs/05_content_nodes.md
@@ -360,64 +360,51 @@ The return value of the visitor may be used to control the traversal:
 If `visitor` is a single function, it will be called with all values encountered in the tree, including e.g. `null` values.
 Alternatively, separate visitor functions may be defined for each `Map`, `Pair`, `Seq`, `Alias` and `Scalar` node.

-## Comments
+## Comments and Blank Lines

 ```js
 const doc = YAML.parseDocument(`
 # This is YAML.
 ---
 it has:
+
 - an array
+
 - of values
 `)

-doc.toJS()
-// { 'it has': [ 'an array', 'of values' ] }
+doc.toJS() // { 'it has': [ 'an array', 'of values' ] }
+doc.commentBefore // ' This is YAML.'

-doc.commentBefore
-// ' This is YAML.'
+const seq = doc.get('it has')
+seq.spaceBefore // true

-const seq = doc.contents.items[0].value
 seq.items[0].comment = ' item comment'
 seq.comment = ' collection end comment'

 doc.toString()
 // # This is YAML.
 //
 // it has:
+//
 // - an array # item comment
+//
 // - of values
 // # collection end comment
 ```

-A primary differentiator between this and other YAML libraries is the ability to programmatically handle comments, which according to [the spec](http://yaml.org/spec/1.2/spec.html#id2767100) "must not have any effect on the serialization tree or representation graph. In particular, comments are not associated with a particular node."
-
-This library does allow comments to be handled programmatically, and does attach them to particular nodes (most often, the following node). Each `Scalar`, `Map`, `Seq` and the `Document` itself has `comment` and `commentBefore` members that may be set to a stringifiable value.
-
-The string contents of comments are not processed by the library, except for merging adjacent comment lines together and prefixing each line with the `#` comment indicator. Document comments will be separated from the rest of the document by a blank line.
-
-**Note**: Due to implementation details, the library's comment handling is not completely stable. In particular, when creating, writing, and then reading a YAML file, comments may sometimes be associated with a different node.
+A primary differentiator between this and other YAML libraries is the ability to programmatically handle comments, which according to [the spec](http://yaml.org/spec/1.2/spec.html#id2767100)
+"must not have any effect on the serialization tree or representation graph. In particular, comments are not associated with a particular node."
+Similarly to comments, the YAML spec instructs non-content blank lines to be discarded.

-## Blank Lines
+This library _does_ allow comments and blank lines to be handled programmatically, and does attach them to particular nodes (most often, the following node).
+Each `Scalar`, `Map`, `Seq` and the `Document` itself has `comment`, `commentBefore` members that may be set to a stringifiable value, and a `spaceBefore` boolean to add an empty line before the comment.

-```js
-const doc = YAML.parseDocument('[ one, two, three ]')
-
-doc.contents.items[0].comment = ' item comment'
-doc.contents.items[1].spaceBefore = true
-doc.comment = ' document end comment'
-
-doc.toString()
-// [
-// one, # item comment
-//
-// two,
-// three
-// ]
-//
-// # document end comment
-```
+The string contents of comments are not processed by the library, except for merging adjacent comment and blank lines together.
+Document comments will be separated from the rest of the document by a blank line.
+In the node member values, comment lines terminating with the `#` indicator are represented by a single space, while completely empty lines are represented as empty strings.

-Similarly to comments, the YAML spec instructs non-content blank lines to be discarded. Instead of doing that, `yaml` provides a `spaceBefore` boolean property for each node. If true, the node (and its `commentBefore`, if any) will be separated from the preceding node by a blank line.
+Scalar block values with "keep" chomping (i.e. with `+` in their header) consider any trailing empty lines to be a part of their content, so the following node's `spaceBefore` or `commentBefore` with leading whitespace is ignored.

-Note that scalar block values with "keep" chomping (i.e. with `+` in their header) consider any trailing empty lines to be a part of their content, so the `spaceBefore` setting of a node following such a value is ignored.
+**Note**: Due to implementation details, the library's comment handling is not completely stable, in particular for trailing comments.
+When creating, writing, and then reading a YAML file, comments may sometimes be associated with a different node.
2 changes: 1 addition & 1 deletion src/compose/composer.ts
@@ -42,7 +42,7 @@ function parsePrelude(prelude: string[]) {
 case '#':
 comment +=
 (comment === '' ? '' : afterEmptyLine ? '\n\n' : '\n') +
-source.substring(1)
+(source.substring(1) || ' ')
 atComment = true
 afterEmptyLine = false
 break
2 changes: 1 addition & 1 deletion src/compose/resolve-end.ts
@@ -24,7 +24,7 @@ export function resolveEnd(
 'COMMENT_SPACE',
 'Comments must be separated from other tokens by white space characters'
 )
-const cb = source.substring(1)
+const cb = source.substring(1) || ' '
 if (!comment) comment = cb
 else comment += sep + cb
 sep = ''
9 changes: 6 additions & 3 deletions src/compose/resolve-props.ts
@@ -48,18 +48,21 @@ export function resolveProps(
 'COMMENT_SPACE',
 'Comments must be separated from other tokens by white space characters'
 )
-const cb = token.source.substring(1)
+const cb = token.source.substring(1) || ' '
 if (!comment) comment = cb
 else comment += commentSep + cb
 commentSep = ''
+atNewline = false
 break
 }
 case 'newline':
-if (atNewline && !comment) spaceBefore = true
+if (atNewline) {
+if (comment) comment += token.source
+else spaceBefore = true
+} else commentSep += token.source
 atNewline = true
 hasNewline = true
 hasSpace = true
-commentSep += token.source
 break
 case 'anchor':
 if (anchor)
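The effect of the `comment` and `newline` branches in the resolve-props.ts change above can be traced with a small standalone sketch. The `Tok` type and the `collectCommentBefore` helper are hypothetical stand-ins for illustration, not part of the library, and the initial `atNewline = true` is an assumption about how the real function is called.

```ts
// Simplified stand-in for the library's source tokens (assumption, not the real type).
type Tok = { type: 'comment' | 'newline'; source: string }

// Mirrors only the comment/newline handling from the changed hunk.
function collectCommentBefore(tokens: Tok[]) {
  let comment = ''
  let commentSep = ''
  let spaceBefore = false
  let atNewline = true // assumed starting state: at the beginning of a line
  for (const token of tokens) {
    if (token.type === 'comment') {
      // A bare `#` has an empty body, which is stored as a single space.
      const cb = token.source.substring(1) || ' '
      if (!comment) comment = cb
      else comment += commentSep + cb
      commentSep = ''
      atNewline = false
    } else {
      if (atNewline) {
        // An empty line: kept inside the comment if one has started,
        // otherwise recorded as spaceBefore.
        if (comment) comment += token.source
        else spaceBefore = true
      } else commentSep += token.source
      atNewline = true
    }
  }
  return { comment, spaceBefore }
}

const nl: Tok = { type: 'newline', source: '\n' }
const c = (source: string): Tok => ({ type: 'comment', source })

// "# a", "#", an empty line, "# b"
const a = collectCommentBefore([c('# a'), nl, c('#'), nl, nl, c('# b'), nl])
console.log(JSON.stringify(a.comment)) // " a\n \n\n b"

// An empty line before the first comment sets spaceBefore instead.
const b = collectCommentBefore([nl, c('# x'), nl])
console.log(b) // { comment: ' x', spaceBefore: true }
```

Splitting `a.comment` on newlines gives `' a'`, `' '`, `''`, `' b'`, which matches the documented representation: a single space for a bare `#` line and an empty string for an empty line.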
101 changes: 47 additions & 54 deletions src/parse/parser.ts
@@ -33,38 +33,6 @@ function includesNonEmpty(list: SourceToken[]) {
 return false
 }

-function atFirstEmptyLineAfterComments(start: SourceToken[]) {
-let hasComment = false
-for (let i = 0; i < start.length; ++i) {
-switch (start[i].type) {
-case 'space':
-break
-case 'comment':
-hasComment = true
-break
-case 'newline':
-if (!hasComment) return false
-break
-default:
-return false
-}
-}
-if (hasComment) {
-for (let i = start.length - 1; i >= 0; --i) {
-switch (start[i].type) {
-/* istanbul ignore next */
-case 'space':
-break
-case 'newline':
-return true
-default:
-return false
-}
-}
-}
-return false
-}
-
 function isFlowToken(
 token: Token | null | undefined
 ): token is FlowScalar | FlowCollection {
@@ -521,21 +489,31 @@ export class Parser {
 switch (this.type) {
 case 'newline':
 this.onKeyLine = false
-if (!it.sep && atFirstEmptyLineAfterComments(it.start)) {
-const prev = map.items[map.items.length - 2]
-const end = (prev?.value as { end: SourceToken[] })?.end
-if (Array.isArray(end)) {
-Array.prototype.push.apply(end, it.start)
-it.start = [this.sourceToken]
-return
-}
-}
-// fallthrough
+if (it.value) {
+const end = 'end' in it.value ? it.value.end : undefined
+const last = Array.isArray(end) ? end[end.length - 1] : undefined
+if (last?.type === 'comment') end?.push(this.sourceToken)
+else map.items.push({ start: [this.sourceToken] })
+} else if (it.sep) it.sep.push(this.sourceToken)
+else it.start.push(this.sourceToken)
+return
 case 'space':
 case 'comment':
 if (it.value) map.items.push({ start: [this.sourceToken] })
 else if (it.sep) it.sep.push(this.sourceToken)
-else it.start.push(this.sourceToken)
+else {
+if (this.atIndentedComment(it.start, map.indent)) {
+const prev = map.items[map.items.length - 2]
+const end = (prev?.value as { end: SourceToken[] })?.end
+if (Array.isArray(end)) {
+Array.prototype.push.apply(end, it.start)
+end.push(this.sourceToken)
+map.items.pop()
+return
+}
+}
+it.start.push(this.sourceToken)
+}
 return
 }
 if (this.indent >= map.indent) {
@@ -643,20 +621,29 @@ export class Parser {
 const it = seq.items[seq.items.length - 1]
 switch (this.type) {
 case 'newline':
-if (!it.value && atFirstEmptyLineAfterComments(it.start)) {
-const prev = seq.items[seq.items.length - 2]
-const end = (prev?.value as { end: SourceToken[] })?.end
-if (Array.isArray(end)) {
-Array.prototype.push.apply(end, it.start)
-it.start = [this.sourceToken]
-return
-}
-}
-// fallthrough
+if (it.value) {
+const end = 'end' in it.value ? it.value.end : undefined
+const last = Array.isArray(end) ? end[end.length - 1] : undefined
+if (last?.type === 'comment') end?.push(this.sourceToken)
+else seq.items.push({ start: [this.sourceToken] })
+} else it.start.push(this.sourceToken)
+return
 case 'space':
 case 'comment':
 if (it.value) seq.items.push({ start: [this.sourceToken] })
-else it.start.push(this.sourceToken)
+else {
+if (this.atIndentedComment(it.start, seq.indent)) {
+const prev = seq.items[seq.items.length - 2]
+const end = (prev?.value as { end: SourceToken[] })?.end
+if (Array.isArray(end)) {
+Array.prototype.push.apply(end, it.start)
+end.push(this.sourceToken)
+seq.items.pop()
+return
+}
+}
+it.start.push(this.sourceToken)
+}
 return
 case 'anchor':
 case 'tag':
@@ -847,6 +834,12 @@ export class Parser {
 return null
 }

+private atIndentedComment(start: SourceToken[], indent: number) {
+if (this.type !== 'comment') return false
+if (this.indent <= indent) return false
+return start.every(st => st.type === 'newline' || st.type === 'space')
+}
+
 private *documentEnd(docEnd: DocumentEnd) {
 if (this.type !== 'doc-mode') {
 if (docEnd.end) docEnd.end.push(this.sourceToken)
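The new `atIndentedComment` check in parser.ts reads parser state from `this`. A standalone version makes the three conditions easier to test in isolation. In this sketch the parser state is passed in as explicit parameters (`currentType`, `currentIndent`), and `TokenLike` is a hypothetical minimal token shape; both are assumptions made for illustration only.

```ts
// Minimal token shape for the sketch (assumption, not the library's SourceToken).
type TokenLike = { type: string }

// Free-function version of the predicate: the current token must be a comment,
// it must be indented deeper than the collection, and the pending item may
// contain nothing but newlines and spaces so far.
function atIndentedComment(
  currentType: string,
  currentIndent: number,
  start: TokenLike[],
  indent: number
): boolean {
  if (currentType !== 'comment') return false
  if (currentIndent <= indent) return false
  return start.every(st => st.type === 'newline' || st.type === 'space')
}

const blankStart: TokenLike[] = [{ type: 'newline' }, { type: 'space' }]

console.log(atIndentedComment('comment', 4, blankStart, 2)) // true
console.log(atIndentedComment('comment', 2, blankStart, 2)) // false: not indented deeper
console.log(atIndentedComment('comment', 4, [{ type: 'comment' }], 2)) // false: item already has a comment
console.log(atIndentedComment('scalar', 4, blankStart, 2)) // false: not a comment token
```

As far as the surrounding hunks show, a `true` result moves the pending tokens and the comment onto the `end` array of the previous item's value and drops the pending item, so a deeper-indented comment stays with the preceding node.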
23 changes: 0 additions & 23 deletions src/stringify/addComment.ts

This file was deleted.

28 changes: 19 additions & 9 deletions src/stringify/stringifyCollection.ts
@@ -1,7 +1,7 @@
 import { Collection } from '../nodes/Collection.js'
-import { addComment } from '../stringify/addComment.js'
-import { stringify, StringifyContext } from '../stringify/stringify.js'
 import { isNode, isPair } from '../nodes/Node.js'
+import { stringify, StringifyContext } from './stringify.js'
+import { addComment, stringifyComment } from './stringifyComment.js'

 type StringifyNode = { comment: boolean; str: string }

@@ -35,10 +35,15 @@ export function stringifyCollection(
 let comment: string | null = null
 if (isNode(item)) {
 if (!chompKeep && item.spaceBefore) nodes.push({ comment: true, str: '' })
-if (item.commentBefore) {
+let cb = item.commentBefore
+if (cb && chompKeep) cb = cb.replace(/^\n+/, '')
+if (cb) {
+if (/^\n+$/.test(cb)) cb = cb.substring(1)
 // This match will always succeed on a non-empty string
-for (const line of item.commentBefore.match(/^.*$/gm) as string[])
-nodes.push({ comment: true, str: `#${line}` })
+for (const line of cb.match(/^.*$/gm) as string[]) {
+const str = line === ' ' ? '#' : line ? `#${line}` : ''
+nodes.push({ comment: true, str })
+}
 }
 if (item.comment) {
 comment = item.comment
@@ -48,10 +53,15 @@ export function stringifyCollection(
 const ik = isNode(item.key) ? item.key : null
 if (ik) {
 if (!chompKeep && ik.spaceBefore) nodes.push({ comment: true, str: '' })
-if (ik.commentBefore) {
+let cb = ik.commentBefore
+if (cb && chompKeep) cb = cb.replace(/^\n+/, '')
+if (cb) {
+if (/^\n+$/.test(cb)) cb = cb.substring(1)
 // This match will always succeed on a non-empty string
-for (const line of ik.commentBefore.match(/^.*$/gm) as string[])
-nodes.push({ comment: true, str: `#${line}` })
+for (const line of cb.match(/^.*$/gm) as string[]) {
+const str = line === ' ' ? '#' : line ? `#${line}` : ''
+nodes.push({ comment: true, str })
+}
 }
 if (ik.comment) singleLineOutput = false
 }
@@ -113,7 +123,7 @@ export function stringifyCollection(
 for (const s of strings) str += s ? `\n${indent}${s}` : '\n'
 }
 if (comment) {
-str += '\n' + comment.replace(/^/gm, `${indent}#`)
+str += '\n' + stringifyComment(comment, indent)
 if (onComment) onComment()
 } else if (chompKeep && onChompKeep) onChompKeep()
 return str
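The `commentBefore` handling that this change repeats for items and pair keys can be pulled out into a small helper to see how each stored line is rendered. `commentBeforeLines` is a hypothetical name used only for this sketch; the body follows the added lines in the hunks above, with the `chompKeep` flag passed in as a parameter.

```ts
// Hypothetical helper that isolates the commentBefore-to-lines mapping.
function commentBeforeLines(commentBefore: string, chompKeep = false): string[] {
  let cb = commentBefore
  // After a "keep"-chomped block scalar, leading empty lines belong to the scalar.
  if (cb && chompKeep) cb = cb.replace(/^\n+/, '')
  const lines: string[] = []
  if (cb) {
    // A value made only of newlines loses one of them.
    if (/^\n+$/.test(cb)) cb = cb.substring(1)
    for (const line of cb.match(/^.*$/gm) as string[]) {
      // ' ' renders as a bare '#', '' as an empty line, anything else as '#' + line.
      lines.push(line === ' ' ? '#' : line ? `#${line}` : '')
    }
  }
  return lines
}

console.log(commentBeforeLines(' a\n \n\n b')) // [ '# a', '#', '', '# b' ]
console.log(commentBeforeLines('\n a')) // [ '', '# a' ]
console.log(commentBeforeLines('\n a', true)) // [ '# a' ]
```

The third call shows the "keep" chomping case from the docs change: the leading empty line of the following node's `commentBefore` is dropped.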
18 changes: 18 additions & 0 deletions src/stringify/stringifyComment.ts
@@ -0,0 +1,18 @@
+export const stringifyComment = (comment: string, indent: string) =>
+/^\n+$/.test(comment)
+? comment.substring(1)
+: comment.replace(/^(?!$)(?: $)?/gm, `${indent}#`)
+
+export function addComment(
+str: string,
+indent: string,
+comment?: string | null
+) {
+return !comment
+? str
+: comment.includes('\n')
+? `${str}\n` + stringifyComment(comment, indent)
+: str.endsWith(' ')
+? `${str}#${comment}`
+: `${str} #${comment}`
+}
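A few sample calls illustrate what the regular expression in the new `stringifyComment` does with ordinary comment lines, bare `#` lines (stored as a single space) and empty lines. The two functions are copied from the file above so that the sketch runs on its own; the input strings are made up for illustration.

```ts
// Copied from the new src/stringify/stringifyComment.ts so the example is self-contained.
const stringifyComment = (comment: string, indent: string) =>
  /^\n+$/.test(comment)
    ? comment.substring(1)
    : comment.replace(/^(?!$)(?: $)?/gm, `${indent}#`)

function addComment(str: string, indent: string, comment?: string | null) {
  return !comment
    ? str
    : comment.includes('\n')
    ? `${str}\n` + stringifyComment(comment, indent)
    : str.endsWith(' ')
    ? `${str}#${comment}`
    : `${str} #${comment}`
}

// Non-empty lines get `indent + '#'`, a lone space becomes a bare '#',
// and empty lines are left untouched.
const block = stringifyComment(' first\n \n\n last', '  ')
console.log(JSON.stringify(block)) // "  # first\n  #\n\n  # last"

// A comment consisting only of newlines loses one newline and gets no '#'.
console.log(JSON.stringify(stringifyComment('\n\n', '  '))) // "\n"

// Single-line comments stay on the same line as the value.
console.log(addComment('key: value', '', ' note')) // key: value # note
console.log(addComment('- ', '', ' note')) // - # note

// Multi-line comments move to the lines after the value.
console.log(JSON.stringify(addComment('value', '  ', ' one\n two'))) // "value\n  # one\n  # two"
```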