Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should positionals be opt-in when strict: true? #85

Closed
bakkot opened this issue Mar 19, 2022 · 40 comments · Fixed by #116
Closed

Should positionals be opt-in when strict: true? #85

bakkot opened this issue Mar 19, 2022 · 40 comments · Fixed by #116
Labels
enhancement New feature or request

Comments

@bakkot
Copy link
Collaborator

bakkot commented Mar 19, 2022

Elsewhere I've mentioned I think the goal of strict: true should be that you can write a well-behaved script just by calling parseArgs with your options configured, with no further validation required (except validation of the values of value-taking options, which is out of scope). Here by "well-behaved" I mean it will surface an error to the user if they try to use the script in a way the author did not intend.

Right now, if I'm writing a script which does not expect positional arguments, and I do the obvious thing of

let { values } = util.parseArgs({ options });

but the user passes a positional argument, I'm never going to notice, and the user's input is going to be silently dropped. And it's very likely that I'll end up shipping this, because it's probably not going to occur to me to try passing a positional argument and so I will never notice this problem.

That seems contrary to the goal above. So I'd like to suggest having a positionals configuration option, which must be explicitly opted into (when strict: true). For people who want positionals, this ought to be a small hiccup: they'll notice right away that passing a positional argument doesn't work. And for people who don't, they'll get the right behavior. So neither group is at risk of shipping a script which is not well-behaved.

(Good errors help here: e.g., "this script has not been configured to take positional arguments" is clear to users that they're misusing the script, and to authors who want to take positional arguments that they need to configure it to take positional arguments.)


Another advantage of having this piece of information is that it lets you improve error messages. If the parser encounters an unknown flag --foo, and the script author has explicitly opted in to positionals, the error message can suggest that if the user intended to pass --foo as a positional argument they can do so by putting -- --foo at the end of the command invocation. But you wouldn't want to suggest this in a script which isn't using positionals.

@ljharb
Copy link
Member

ljharb commented Mar 19, 2022

That seems pretty reasonable - maybe a “positionalArgs” option, that can be true, or an integer?

@bakkot
Copy link
Collaborator Author

bakkot commented Mar 19, 2022

Would the integer be a minimum, a maximum, or a precise number of positional arguments to expect? None of those are very satisfying answers, I think.

I suppose you could allow specifying both a minimum and a maximum, at the cost of even more config knobs.

@ljharb
Copy link
Member

ljharb commented Mar 19, 2022

I’d expect a maximum, but yes, it could instead be true, or an object with min/max integer properties.

@shadowspawn
Copy link
Collaborator

Quick comment. Good description! However, I don't like adding configuration to use basic features of the parser, so still thinking about it.

@bakkot
Copy link
Collaborator Author

bakkot commented Mar 19, 2022

However, I don't like adding configuration to use basic features of the parser

Well, the other half of that is that absent this config options, scripts which don't want positionals but still want to be well-behaved will need to add extra logic to use the parser at all: const results = parseArgs({options}); if (result.positionals.length > 0) throw new Error('this script does not take positional arguments');.

So one of the two groups (expecting-positionals and not-expecting-positionals) is inevitably going to have to add something, to get the behavior they want and be well-behaved.

Not being able to configure this amounts to deciding that the not-expecting-positionals group is the one which will need to add something. Which is fine, in general, but my concern is that in this particular case that group is not even going to notice they needed to add something to be well-behaved, and consequently ship a poorly-behaved script, which is (I think) a bad outcome.

(And, separately, having this bit would allow better error messages, which I think is valuable.)

@shadowspawn
Copy link
Collaborator

Here by "well-behaved" I mean it will surface an error to the user if they try to use the script in a way the author did not intend.

This is a thinking-out-loud comment, not fully formed. I like the concept, but I am concerned it is a slippery slope of diminishing returns. At what point do we say no?


A deliberately over-stated description of "well-behaved", but where do we draw the line?

  • We just need one more piece of configuration for situation x, then we could detect and display an error. And it has to be on by default or does not achieve "well-behaved" by default. So it isn't just a config knob, it is a required config knob for usage not covered by the default or not able to have a default.

  • And we should block some standard functionality, because it allows us to detect more errors. (Like blocking --foo VALUE.)

The extreme end-point is a safe "well-behaved" subset of functionality that requires extensive configuration and does not behave like other programs. And tutorials will say "use strict:false to simplify setup"!

Ok, please resume normal levels of cynicism and scepticism.

@ljharb
Copy link
Member

ljharb commented Mar 20, 2022

Why must there be a point where we say no? Zero config strict mode means that any time there’s something we discover is a footgun, we’d want to make the default behavior safe, even if that means common cases would require more configuration.

@shadowspawn
Copy link
Collaborator

Would the integer be a minimum, a maximum, or a precise number of positional arguments to expect? None of those are very satisfying answers, I think.

I suppose you could allow specifying both a minimum and a maximum, at the cost of even more config knobs.

Some common use cases are:

  • no positionals. e.g. ls
  • an exact number of positionals. e.g. copy source dest
  • variadic, possibly with a minimum e.g. print *.js

@shadowspawn
Copy link
Collaborator

Zero config strict mode means that any time there’s something we discover is a footgun, we’d want to make the default behavior safe, even if that means common cases would require more configuration.

I don't think "zero config" belongs in that sentence. 😉

@shadowspawn
Copy link
Collaborator

Why must there be a point where we say no?

Ok, I'll bite.

Because there isn't agreement there is demand or a need for a bullet-proof argument parser built into node. That wasn't in the previous proposal, or the initial spec here, or the current spec.

Because each additional piece of safety configuration offers the potential to detect one more footgun, but makes the library harder to get started with, or less convenient for the end user. As the starting configuration gets more intimidating, the chances people start from a boiler-plate starting configuration which bypasses the protections increase. Or people use strict:false. Or people go elsewhere and the adoption of the library is lowered.

I care about the cost-benefit ratio. I think there is often a trade-off between safety and usability and some protections are not worth the penalties.

To be clear, I opened the issue about whether there is going to be a strict mode at all, and I would prefer to include a strict mode in the first implementation. But not bullet-proof.

@bakkot
Copy link
Collaborator Author

bakkot commented Mar 20, 2022

I am concerned it is a slippery slope of diminishing returns. At what point do we say no?

Yeah, I agree this slope is pretty slippery. I think the heuristic I would use for strict: true is something like:

  • if authors in the most common scenario would reasonably expect the validation to happen (e.g. any type: 'string' option is given a value), that validation should definitely happen
  • if authors in some common scenario are unlikely to notice they need to do extra validation to meet my definition of well-behaved, that validation should probably happen by default (this issue)
  • other kinds of validation should not happen by default, or for that matter at all (e.g. validation that the values for particular options parse as numbers: because script authors must parse strings to numbers themselves, it should be obvious they need to verify that the format is as they expect)

The third thing, in particular, means that there are always going to be scripts which are not well-behaved by my definition without the author performing additional validation. That's OK, as long as it's reasonably obvious to authors that they are going to need to do that validation themselves.

I think having a boolean for positionals is the right balance on this issue. It gets rid of the sharp edge of "it didn't occur to me to check for and reject positionals" (which is a very easy mistake to make). And once you've explicitly opted in to getting an array of positional arguments, hopefully at that point you're going to think to check if you got too few or too many (if those concepts are relevant to your application). Some people still won't do those checks, but you can't save everyone.

I could see an argument [heh] for having minPositionals and maxPositionals, both defaulting to 0. I'm not advocating that, to be clear, although I'm fine with that as a resolution to this issue. But I personally can't think any other kinds of validation beyond that (other than those already planned) which would fall into either of my first two bullet points, which makes me feel better about stopping at exactly this point on the slippery slope, and no further.


This of course hinges on author expectations, and in particular how many people we think are going to run into a given sharp edge. I think it's going to be pretty common to write applications which aren't expecting any positional arguments, so I think that's a scenario we should weight pretty highly.

There's always going to be someone out there who is surprised by, say, the fact that --foo=false gives them the string false, but I think that's a more minority case - there's just not that many applications where this behavior would result in a bug and the author probably isn't going to think check for it. So I don't think that needs to be validated by default, even though there will be some number of scripts which end up being not-well-behaved for lack of validation here.

@shadowspawn
Copy link
Collaborator

Nice heuristic. Thanks for taking the time to lay it out!

I am up for an opt-in boolean for positionals.

There's always going to be someone out there who is surprised by, say, the fact that --foo=false gives them the string false,

If foo is type:boolean I think this will be an error in strict mode, which of particular note addresses the --explode=false concerns raised in previous proposal discussion.

@shadowspawn shadowspawn added discussion enhancement New feature or request and removed discussion labels Mar 27, 2022
@shadowspawn
Copy link
Collaborator

What is this opt-in property named? Trying a couple of other variations:

const { values, positionals } = parseArgs({ strict: true, positionals: true, options: knownOptions });
const { values, positionals } = parseArgs({ strict: true, allowPositionals: true, options: knownOptions });
const { values, positionals } = parseArgs({ strict: true, expectPositionals: true, options: knownOptions });

@bakkot
Copy link
Collaborator Author

bakkot commented Apr 2, 2022

I would call it positionals, I think. allowPositionals is kind of long to type.

@shadowspawn
Copy link
Collaborator

I still prefer a longer name. A few reasons:

  • you only have to type it once (won't be used multiple times in code)
  • clearer what it does when reading code
  • I am uneasy about using same name for properties on input and output

@bcoe
Copy link
Collaborator

bcoe commented Apr 9, 2022

I think I've been nudged towards the argument that options should default to strict=true. From an API perspective, it seems pretty natural and easy to explain that, by default, you will need to provide config for each option that you want to accept (a nice side-effect being that it eliminates ambiguity between boolean and string options).

I'm supportive of the allowPositionals configuration, for @bakkot's use case. But I tend to agree with @shadowspawn that I'm worried about adding another config setting individuals will frequently need to set for a very common use-case.

My preference would be allowPositionals defaulting to true.

@aaronccasanova
Copy link
Collaborator

+1 for allowPositionals and defaulting to true

@bakkot
Copy link
Collaborator Author

bakkot commented Apr 9, 2022

It's not very useful to have the option if the default is to allow positionals. After all, if a script author remembers that they need to explicitly disallow positional arguments for their particular application, they can write if (result.positionals.length > 0) throw new Error(`unexpected argument ${JSON.stringify(positionals[0])}`) or whatever. It doesn't make much difference whether they write this or do allowPositionals: false; it's actually one of the easier kinds of validation to do manually. My concern is specifically that people won't remember that they need to check: if you are writing a script that doesn't take positional arguments at all, you're probably not going to be thinking about or testing the possibility that a user might try to give you one.

That is, I think that making positionals opt-in is in keeping with the "you will need to provide config for each option that you want to accept" design. I regard positionals as being a kind of option that you want to accept. If we don't accept this point of view, I don't think there's much benefit to have a config option to disallow them at all, since a script author can do that themselves trivially.

@shadowspawn
Copy link
Collaborator

I also think either allowPositionals: false by default and need to opt-in to positionals, or not at all.

And I am ok either way.

@bcoe
Copy link
Collaborator

bcoe commented Apr 10, 2022

Shall we just take this feature to a vote with 👍 (allowPositionals defaulting to false), 👎 (launching without allowPositionals).

@bcoe
Copy link
Collaborator

bcoe commented Apr 13, 2022

@shadowspawn, @iansu, anyone else want to chime in ☝️, otherwise I'd say we go the route @bakkot is suggesting by a narrow margin (2 👍, to 1 👎 ).

@bcoe
Copy link
Collaborator

bcoe commented Apr 13, 2022

@bakkot @shadowspawn, looks like we might not quite make 18.0.0, unless address blockers ASAP, see:

nodejs/node#42262 (comment)

If there's a chance it addresses our last open point of discussion, I'm pro allowPositionals. I also don't want to rush us, if we don't think it's realistic to get a release out the door tomorrow.

Worst case scenario, we land in an early minor release of Node 18, and people can start to rely on parseArgs in Node 19/20.

@shadowspawn
Copy link
Collaborator

shadowspawn commented Apr 13, 2022

I think it would be a rush for tomorrow. We have not landed any of strict yet.

I suggest we do not attempt this now, rather than rush a decision. Reasoning:

  • if we add allowPositionals later we explicitly break people, which is better than...
  • if we take allowPositionals out later, we silently change behaviour and stop detecting usage errors
  • we haven't landed strict yet

Of historical interest, it used to be very hard to check for excess positionals in Commander including by client. It irritated me but not many issues touched on it. I added support for detecting excess with the intention of turning it on by default, but broke a number of my own unit tests and decided to make it opt-in to ship with less churn for existing users. I know that is an argument each way, but my point is that the lack of checks did come up much as a reported issue.

@shadowspawn
Copy link
Collaborator

I suggest we do not attempt this now, rather than rush a decision.

To be clear, I suggest this issue misses out on MVP. Not that we delay MVP.

@bcoe
Copy link
Collaborator

bcoe commented Apr 13, 2022

@bakkot are you okay with the compromise of landing this post MVP, the advantage of trying to get in for Node v18.0.0 (_not to overpromise, as we might already be too late for this.) is we don't need to spend the next year addressing bugs where we explain to people that parseArgs wasn't available until Node 18.x.y.

@ljharb
Copy link
Member

ljharb commented Apr 13, 2022

That doesn’t seem like that big a deal to me - it’ll just be listed in the docs.

@bakkot
Copy link
Collaborator Author

bakkot commented Apr 13, 2022

I think you'd know better than I would the costs and benefits of introducing an additional restriction like this after shipping. If you think a restriction like this can be reasonably added in a post-MVP version, that's fine with me. If not - if you think people are going to depend on getting positionals, such that node wouldn't want to break those people by making positionals opt-in - well, I don't know how serious a cost it is to miss 18.0.0, so that's hard for me to weigh that against never having this restriction.

Certainly it's not the end of the world if this gets left out forever, though. It'll be another thing many consumers need to always remember to validate lest their application silently drop input, to the annoyance of their users. But we can't eliminate all of those cases, so having another big one, while in my view it would be quite unfortunate, isn't fatal.

@bcoe
Copy link
Collaborator

bcoe commented Apr 13, 2022

I think you'd know better than I would the costs and benefits of introducing an additional restriction like this after shipping.

I think there's a tradeoff, my experience with adding code coverage to Node.js, which landed in a minor release, was that the feature didn't see significant adoption until the next major releases -- I think people are hesitant to deliver an application that will only work with a certain version range of runtimes.

However, I do want to be careful not to force util.parseArgs over the finish line of we don't think it's ready. I think I'm more concerned about #104, than this thread leading to a breaking change (since we've marked the feature as experimental and, furthermore, adoption will be gradual).


My concerns about landing in a minor release of Node.js might be out of date. People building GitHub actions with Node.js would be able to immediately benefit from the feature, this is a newish application of Node.js that I expect this feature will be popular in.

@ljharb
Copy link
Member

ljharb commented Apr 13, 2022

In my experience, pre-LTS minors of node 18 just won't matter; it's only the first LTS minor that people are likely to be "stuck" on. I think as long as we land in node 18 prior to LTS, we're perfectly fine, and I'd prefer to land with as full a featureset as we think is warranted (including this issue).

@bakkot
Copy link
Collaborator Author

bakkot commented Apr 13, 2022

I'll defer to y'all's judgement on this.

@bcoe
Copy link
Collaborator

bcoe commented Apr 13, 2022

In my experience, pre-LTS minors of node 18 just won't matter; it's only the first LTS minor that people are likely to be "stuck" on. I think as long as we land in node 18 prior to LTS, we're perfectly fine, and I'd prefer to land with as full a featureset as we think is warranted (including this issue).

I'd rather we not rush, let's take our hat out of the running for 18.0.0 -- I don't think we'll realistically be happy with our audit of prototype pollution by EOD.

And this frees us up to work on this feature as well. Thoughts @shadowspawn?

@shadowspawn
Copy link
Collaborator

shadowspawn commented Apr 14, 2022

I think the timezone difference means this is a done deal with parseArgs landing after 18.0.0!? [edit: I added the "after", missing in the first draft and sorry for any confusion if you saw that!]

I abstained from the vote, as I am ok either way. To be clear:

  • I am willing to ship without this. Ever. And we have no new people asking for it from the PR viewers.
  • I am willing to contribute to this. I may get more enthusiastic about it from working on it and seeing it solidify! But if it doesn't come together then I will push for it to be abandoned.

[edit: sorry if that was a bit blunt!]

@shadowspawn
Copy link
Collaborator

shadowspawn commented Apr 14, 2022

A common approach for ideation is to go wide, then come narrow down again with consensus, and can focus on what is left. So for narrowing, are we agreed that:

  • one optional boolean configuration flag
  • strict:true unless opt-in: get an error if args include positional
  • strict:true with opt-in to positionals: any positionals in args are returned and validation of positionals left entirely to author

Please vote to gauge consensus.

@shadowspawn
Copy link
Collaborator

shadowspawn commented Apr 14, 2022

This one is was a suggestion [now retracted]. Making up the last case as I write it!

  1. strict:false + no config: no error, no effect
  2. strict:false + opt-in: no effect (i.e. like strict:true, sure have all the positionals you want!)
  3. strict:false + opt-out (!): author usage error. Keep the story simple, strict:false means bring-your-own-validation. Not an expected setting, but it can be good practice not to rely on default behaviours. With no validation, it would be actively misleading to look like you were going to get validation so block this combination.

Edit: 1 and 3 conflict a bit. Retracting this suggestion. Perhaps better the configuration is just ignored completely in strict:false

@shadowspawn
Copy link
Collaborator

shadowspawn commented Apr 14, 2022

Naming: shall the opt-in configuration be named allowPositionals?

(Note: I am against positionals, so that is not an alternative that will get unanimous support!)

Please vote to gauge consensus.

@shadowspawn
Copy link
Collaborator

I think the code will be easy. Where it gets interesting is documenting this so it delivers batteries-included benefits to the people not using positionals (as eloquently described by @bakkot), and does not prove a pain point for people who do wish positionals (a concern to @bcoe and me).

@bcoe
Copy link
Collaborator

bcoe commented Apr 14, 2022

Naming: shall the opt-in configuration be named allowPositionals?

I also prefer the more descriptive allowPositionals.

I abstained from the vote, as I am ok either way. To be clear:

I'm also okay either way at this point, it seemed like we had one more vote for adding this feature than not.

@bakkot it sounded like you'd be willing to implement this feature? why don't you create a PR, and we can reference it in the PR to merge parseArgs into Node.js.

If no one makes a strong argument against allowPositionals on the open PR, I'm open to landing your recommendation.

We're calling parseArgs experimental, if user feedback says they don't like allowPositionals we can change our mind about the feature in the future.

@bakkot
Copy link
Collaborator Author

bakkot commented Apr 14, 2022

@bcoe Happy to make a PR, but it might be a day or two.

@shadowspawn
Copy link
Collaborator

Is there a desired timeframe @bcoe ? (I am guessing perhaps no fixed deadline, just as soon as convenient?)

@bcoe
Copy link
Collaborator

bcoe commented Apr 14, 2022

Is there a desired timeframe @bcoe ? (I am guessing perhaps no fixed deadline, just as soon as convenient?)

No set deadline, but it can get annoying keeping an upstream PR in sync for too long, mid next week might be a good goal for MVP into Node.js?

We'll have an opportunity to do some follow up PRS before 18.1.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants