Consider removing all the "valid but discouraged" commentary from the spec #749

Closed
JoeUX opened this issue Jun 20, 2020 · 11 comments

@JoeUX

JoeUX commented Jun 20, 2020

Ahoy. There are nine instances of the word "discouraged" in the spec, spanning at least four topics / injunctions. For me, they're a noteworthy distraction. I suggest removing all of them and just having the spec express its opinions in its specification, not in commentary.

One instance is about "out of order" keys. What do we care how someone orders their keys? Keep in mind that the spec for a markup or config language can only reach so far into space-time. These are text files meant to be consumed by humans and machines. How they're consumed by exogenous systems might ultimately matter more than a specification around their formatting. (This insight is probably more relevant to the purported type system, which in a real sense is unenforceable since these are just text files and systems that consume TOML will have to map to their native types.)

Another instance offers no explanation. It's this bit about integers:

int5 = 1_000
int6 = 5_349_221
int7 = 1_2_3_4_5     # VALID but discouraged

What do we care? I wouldn't allow underscores in integers to begin with, but if the spec allows them between each digit, what do we care if someone uses them thusly? Further, it's confusing to say "discouraged" without any explanation. I suggest that the spec either mandate the usual grouping pattern, as in int5 and int6, or continue to allow any pattern, without commentary.
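To make the point about grouping concrete, here is a minimal Python sketch of the decimal-integer shape as I read it from the TOML grammar: an optional sign, no leading zeros, and every underscore surrounded by digits. The regex and the `parse_dec_int` helper are illustrative assumptions, not part of the spec or of any particular parser. Under these assumptions, every grouping pattern yields the same value.

```python
import re

# Assumed decimal-integer shape: optional sign, then either a single digit or
# a non-zero leading digit followed by more digits, where each underscore
# must sit between two digits.
DEC_INT = re.compile(r"[+-]?(?:[0-9]|[1-9](?:_?[0-9])+)")


def parse_dec_int(text: str) -> int:
    """Hypothetical helper: value of a TOML-style decimal integer literal."""
    if DEC_INT.fullmatch(text) is None:
        raise ValueError(f"not a valid decimal integer: {text!r}")
    # Underscores are purely visual, so they are dropped before conversion.
    return int(text.replace("_", ""))


for literal in ("1_000", "5_349_221", "1_2_3_4_5", "12345"):
    print(literal, "->", parse_dec_int(literal))
```

In this sketch `1_2_3_4_5` and `12345` are indistinguishable once parsed, so "discouraged" can only be a statement about readability, not about the data.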

In general, I'm suggesting that the spec just behave like a spec, absent all these "discouraged" asides. A mild reason for this would be to keep the spec as short and clear as possible, balanced against other goals. A major reason for this is a philosophy of baking any opinions into the spec itself, rather than in distracting asides and commentary.

@ChristianSi
Contributor

I'm against that. In general, the spec uses "discouraged" for things that are bad style and hence better avoided. Trying to outlaw bad style is a huge and, in general, hopeless enterprise. It's better to just give recommendations on good style or good practice as opposed to bad style or bad practice, and "discourage" the latter – which is what the spec does. Also, turning any of these discouragements into prohibitions is not possible before TOML 2.0, since it would break compatibility.

@abelbraaksma
Contributor

abelbraaksma commented Jun 21, 2020

I kinda like these notes in the spec, though a future edition might have a style guide or "do's and don'ts" section, so that this is in one location.

About the integers, that's perhaps the one place where it's a bit misguided to have said comment. In several countries it's, for instance, common to use digit grouping by two (India, if I'm not mistaken). And sometimes sensible grouping is just logical, but irregular, for instance with GUIDs (but 128-bit integers are not supported out of the box, so this is a bit of a non-issue now), phone numbers, or social security numbers.

@abelbraaksma
Contributor

abelbraaksma commented Jun 21, 2020

Actually, India uses a mix, see https://en.m.wikipedia.org/wiki/Decimal_separator#Digit_grouping. And in other Asian countries, grouping by myriads is apparently not uncommon.

For instance, 10_00_000 would be 1 million in India.

It's important that TOML embraces the international community by being inclusive (see also the discussion on Unicode literals), and thus not limiting this type of formatting. I believe that to be a good thing, though anybody can always adopt a different style guide, or create a parser that's less inclusive if they so choose.
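The Indian-style grouping mentioned above (last three digits together, then groups of two) can be sketched in a few lines of Python. The `group_indian` function is a hypothetical helper written for illustration only; it assumes non-negative integers and uses `_` as the separator so that the output is also a valid underscore-separated integer.

```python
def group_indian(value: int) -> str:
    """Hypothetical helper: format an int with Indian-style digit groups."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = str(value)
    # The rightmost group holds up to three digits; the rest go in pairs.
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    groups.append(tail)
    return "_".join(groups)


print(group_indian(1_000_000))   # one million
print(group_indian(10_000_000))  # ten million (one crore)
```

Stripping the underscores from the result gives back the original number, so the grouping carries no information beyond how the author prefers to read it.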

@JoeUX
Author

JoeUX commented Jun 21, 2020

Good point about India. Cultural differences are a strong argument against discouraging different forms and preferences.

I do like the Go formatting standard though, so that source always looks the same and is more easily compared. Not sure how to apply that here though. I'm also surprised that TOML key names have to be ASCII. Seems like an arbitrary and dated preference.

@abelbraaksma
Contributor

abelbraaksma commented Jun 21, 2020

> Seems like an arbitrary and dated preference.

Not arbitrary, but certainly dated. That's why I mentioned the literals (and with it, key names) discussion, which is precisely about that. Just forgot where it was, but it'll be one of the first things to go in vNext after 1.0 is final and some have already adopted it in their parsers.

> I do like the Go formatting standard though

Some love it, others hate it, just witnessed a long discussion between the two groups. I guess it's a personal preference whether you prefer a language that's overly strict or not. But don't confuse TOML with a programming language, it's not ;).

@marzer
Contributor

marzer commented Jun 21, 2020

> I'm also surprised that TOML key names have to be ASCII. Seems like an arbitrary and dated preference.

Proposal to rectify that post-1.0: #687

@ChristianSi
Contributor

@abelbraaksma's comment on digit grouping certainly is a good argument for removing this one discouragement from the spec:

int7 = 1_2_3_4_5     # VALID but discouraged

We could instead replace it with a more appropriate example of how underscores in numbers might reasonably be used, say:

int7 = 10_00_000  # Indian-style number grouping

@JoeUX
Author

JoeUX commented Jun 22, 2020

@abelbraaksma Got it, thanks.

I don't confuse TOML with a PL, though it's interesting to think about a strict format for a data serialization / markup language. Especially one that has a twin binary form, like Amazon Ion or Protocol Buffers. Text formatting should probably just be automated anyway.

@ChristianSi mentioned this issue Jun 22, 2020
@pradyunsg
Member

pradyunsg commented Jun 23, 2020

The TOML specification is NOT purely an implementation-oriented specification -- it's certainly meant to be read by someone who writes a TOML file, and these comments are directed toward pushing for a certain degree of consistency.

I don't think we'll be dropping any of the notices about "discouraged" forms. I view them as being useful guiderails and none of them are bad suggestions. I'm sure there could be some disagreements, but those are inherent in discussions about "poor form" / style.


That said, there are 2 actionable items in this discussion that I'll file follow-up issues for:

  • adding 1_00_00_000 as an example of Indian-style number separation in integers.
  • changing "strongly discouraged" to "not possible" (for breaking inline tables across multiple lines)

@pradyunsg
Member

I'm going to go ahead and close this issue now, since I don't think we should remove the commentary. Nonetheless, thanks for filing this issue @JoeUX.

Also, thanks everyone who's participated in this discussion. ^>^

@abelbraaksma
Contributor

@pradyunsg, thanks for making the changes, I totally agree with your motivation :).
