Clarify whitespace and newline rules. #264

Merged
merged 3 commits into from
Dec 17, 2014

Conversation

mojombo
Member

@mojombo mojombo commented Nov 11, 2014

This PR greatly clarifies the use of whitespace and newlines in TOML. A few notes and ramifications:

  • Behavior on Windows is noted.
  • Key and table names cannot contain any whitespace, even around brackets or dots.
  • Keys and dot-separated table parts follow the exact same rules.


```toml
[table]
key = "value"
```

You can indent keys and their values as much as you like. Tabs or spaces. Knock
yourself out. Why, you ask? Because you can have nested tables. Snap.
Contributor

Why did you delete these two lines?

Member Author

Indenting is covered by the previous statement that "Whitespace is ignored around key names and values.", so I thought it to be redundant. Also, the ability to indent is a weird way to segue into nested tables, and makes it sound as if indentation might carry some semantic value.

@ChristianSi
Contributor

I like the definition of what is and what isn't allowed in keys and table names (or their parts).

A question about the newline handling: this reads to me as if the newline conventions used on the system where a TOML file is created become a permanent part of multiline strings. So assuming the following string comes from a file using Windows newlines:

"""I'm a multiline
string."""

it would be byte-for-byte equivalent with

"I'm a multiline\r\nstring."

Assuming a TOML processor running on Unix reads that file and then creates an updated version, using Unix newlines, this seems to imply it would be forced to serialize the string as

"""I'm a multiline\r
string."""

or

"I'm a multiline\r\nstring."

Not sure if that's indeed the meaning of the spec. But if it is, I must say that I find it quite unintuitive and possibly annoying.

I think it might be better to say approximately the following:

  • On reading, every occurrence of CRLF (0x0D 0x0A), standalone LF or standalone CR is converted into a newline.
  • On writing, every newline is converted into the newline representation conventionally used on the platform where the TOML generator is running, i.e. either LF (Unix-style) or CRLF (Windows-style).

That's approximately what Python's "universal newline mode" does.
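
The read/write proposal above can be sketched in a few lines of Python. This is only an illustration of the idea under discussion, not what the spec or any TOML library does; the function names `normalize_newlines` and `to_platform_newlines` are hypothetical.

```python
import os
import re

# Matches CRLF first, then a standalone CR or a standalone LF.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str) -> str:
    """On reading: collapse CRLF, lone CR and lone LF into a single LF."""
    return _ANY_NEWLINE.sub("\n", text)


def to_platform_newlines(text: str, newline: str = os.linesep) -> str:
    """On writing: expand each LF into the target newline sequence.

    Assumes `text` has already been normalized, so it contains no CR.
    """
    return text.replace("\n", newline)
```

With this approach, a multiline string read from a Windows file and written back on Unix would round-trip as plain LF rather than carrying an escaped `\r`.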

@mojombo
Member Author

mojombo commented Nov 12, 2014

cc @BurntSushi - I'd love your opinion on this.

@mojombo
Member Author

mojombo commented Nov 12, 2014

I should also mention that depending on the outcome of #220 the key/table name rules may change. This PR is primarily to get the newline issue nailed down.

@@ -62,6 +62,7 @@ Spec

* TOML is case sensitive.
* Whitespace means tab (0x09) or space (0x20).
* Newline means CR (0x0D) or LF (0x0A).

Member

What if Newline were defined as either \r\n or \n? This leaves out a lone \r as a newline, but I think this is OK, unless it's still commonly used somewhere.

Contributor

👍

\r was once used on Mac, but Mac OS X changed that AFAIK.

@mojombo
Member Author

mojombo commented Dec 1, 2014

Ok, I've made a few changes to define newline as LF or CRLF and tightened up the copy to reflect that change. cc @BurntSushi @ChristianSi - I'd love your thoughts on the new text.
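
As a rough illustration of the revised definition (newline is LF or CRLF only), a line splitter might look like the sketch below. The helper name `split_toml_lines` is hypothetical, and how a lone CR should be handled beyond "not a newline" is an assumption, since the thread does not say.

```python
import re

# A newline is LF, optionally preceded by CR. A lone CR does not match.
_TOML_NEWLINE = re.compile(r"\r?\n")


def split_toml_lines(text: str) -> list:
    """Split text on LF or CRLF; a lone CR stays inside its line."""
    return _TOML_NEWLINE.split(text)
```

Here `split_toml_lines("a\r\nb\nc\rd")` yields three lines, with the lone `\r` left inside the last one for the caller to accept or reject.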

@mojombo
Member Author

mojombo commented Dec 1, 2014

@ChristianSi Ok, just pushed a small edit for that.

@BurntSushi
Member

LGTM. :-)

@mojombo
Member Author

mojombo commented Dec 17, 2014

Ok, thanks for your feedback on this, everyone. Merging!

mojombo added a commit that referenced this pull request Dec 17, 2014
Clarify whitespace and newline rules.
@mojombo mojombo merged commit 520b851 into master Dec 17, 2014
@mojombo mojombo deleted the newlines branch December 17, 2014 21:09