Syntactic sugar for distinguishing value from closure arguments #513
Comments
If we do not bother checking "prototypes" at call sites, then this is a fairly simple feature to implement in the parser/compiler. We can't actually check prototypes at call sites anyway, since we have no easy way to distinguish a "generator" expression from an expression that produces just the one value desired.
Given that this is just syntactic sugar, that we can't actually do anything interesting with this at call sites, I think this should be closed. But perhaps we could adopt this for documentation purposes. I.e., document @stedolan What think ye? Leaving this open for now. |
Actually, there is something we can do at the call sites (at link time): we can check that the closure being passed to a |
Hmm. Why do we need this distinction? There's nothing wrong with passing closures. There are essentially no contexts in jq where an "expression that produces exactly one value" is what's needed. In general in jq, So, passing closures to |
@stedolan I've been finding that closure arguments often confuse users. Certainly a way to document that a given [closure] argument to a given function is intended to produce a single value... would be nice. Sure, we can say so in the function's description. But a syntactic way to do the same would be even better. Also, having a short-hand that captures one value from the closure and binds it to a Finally, if we could statically detect and warn about generators passed where only one value will be consumed, that would be even better. Basically, I love that all function arguments are closures, potentially generators, but it's not that easy to explain. Also, I myself often typo argument separator |
...
My point is that there shouldn't be any of those. There should be no functions that operate on "one value", only those that operate on one value at a time, and produce as many results as their arguments produce. The function shouldn't "intend" that its argument produce only one value. It shouldn't care how many values the argument produces, and just produce a result for each value.
The |
@stedolan - I'd just like to point out that this thread began as the result of a previous discussion regarding the distinction between "regular filters" and "special filters", the difference between the two being as follows, taking the case of arity-1 filters for simplicity. Let us define f/1 to be regular if for all generators, s, (My example of a "regular" filter was The point I was trying to make -- and the point I'd ask you to consider -- is that neither the name of a filter nor the way in which it is called makes it obvious whether it is regular or special. This may have been by design, but some programming languages make it easy (or even mandatory) to distinguish between "functions" and "macros" at the point at which they are called, e.g. by some naming convention. So far as jq is concerned, I realize the cat is already out of the bag. Still, for new "special" builtins, it would be possible to adopt a naming convention. I am not advocating "!" specifically, but it wouldn't be too late to rename "limit" to "limit!", in that [EDITED to correct definition of "regular".] |
@stedolan We do have a few such functions. Check out When we need to pass multiple values to a function the only reliable way to do it is by collecting them into an array or object to pass as the input of that function. It looks weird, and in particular it's weird to newbies. Explaining that function arguments are closures, in a way that newbies are likely to understand and keep in mind, is hard (though you've done a much better job of it in the manual than I could have). And in practice it's not difficult to use single-value producing closures as function arguments, but it takes care, and I'm thinking we could do more to a) signal an interface's contract to use a single value, b) document it, and c) [with some limitations] check this at compile/link time. |
@stedolan I do agree that we could require parens around closure expressions that use commas. I'd thought of that but it seemed a bit hacky; if you're OK with it though, then I am as well.
@pkoppstein |
@nicowilliams -- Not knowing what the alternatives really are, I just gave "!" as an example since it does seem to be an actual possibility. But "!" would not be a bad choice -- it connotes surprise! Pay attention! Be careful! Leading underscores already have another significance, and perhaps trailing underscores should be left for users to play with. "@" already has found a niche in jq. C uses UPPERCASE, and I suppose that would be a possibility, but between these possibilities, my preference would be "!" (at least as I write :0). Are there any other likely candidates? As for regex filters -- as I recall, they all take their "target" string argument from input, precisely so that one can present a stream of target strings to them. What other realistic possibility is there? Maybe I've totally missed your point, but it almost seems as though you've momentarily forgotten that in jq it's a feature that |
@pkoppstein The modifiers, for example, in the regex functions, are expected to be single-valued.
@pkoppstein Also, what What should Nor is it the case that the closures must be used as generators. They can be filters (e.g., |
@nicowilliams wrote:
Yes, but so what? I think you missed the point about |s| * |t| outputs for test/2, for example, already has the desired characteristics:
@wtlangford and @stedolan might like to chime in, but I think @wtlangford got it exactly right.
@nicowilliams asked:
Well, that's the jq way: here, there, and just about everywhere. I can almost hear Alec Guinness intone: "Embrace the streams, Nico." Are you suggesting that we should go back to 'test([ s, flags])'?
@pkoppstein Ah, yes, we do get cartesian product right now when passing two or more generators to a C-coded function. I know that |
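The cartesian-product behaviour mentioned here can be seen in a small sketch. It assumes a jq build in which `+` and the regex primitive behind `test/2` are C-coded and `test/2` passes its arguments straight through, as in jq 1.5 and later; the input values are made up for illustration. Results are sorted or counted because the order in which the pairs are emitted is not something the sketch relies on.

```jq
# Run with: jq -n -c -f example.jq
(
  # Two generators passed to `+`: one result per (left, right) pair.
  [(1,2) + (10,20)] | sort                     # [11,12,21,22]
),
(
  # Two regexes and two flag strings passed to test/2: four booleans.
  ["abc" | test("a","B"; "x","i")] | length    # 4
)
```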
I'll close this.
@pkoppstein your distinction between "special" and "regular" functions is exactly right. There's no syntax to distinguish them, and I think the only sensible approach is to try and limit the number of "special" functions as much as possible. @nicowilliams yep, match($patterns[]; $modes[]) should be a cartesian product. Everything should be a cartesian product unless there's a really good reason not to. I regret that there are a few more "special" operations than there really need to be, which makes things confusing. I'm not sure whether "select" should really be special, or the |
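As a rough illustration of the distinction, reading "regular" as "one result per value the argument produces", here is a minimal sketch. `pair_regular` and `pair_special` are hypothetical names invented for this example, not builtins; `limit/2` is the builtin mentioned earlier in the thread.

```jq
# Run with: jq -n -c -f example.jq

# Regular: each output of `x` is bound in turn, so a generator argument
# yields one result per value.
def pair_regular(x): x as $v | [$v, $v];

# Special: `x` is used as a closure and re-run at each place it appears.
def pair_special(x): [x, x];

[pair_regular(1,2)],     # [[1,1],[2,2]]
[pair_special(1,2)],     # [[1,2,1,2]]
[limit(2; 1,2,3)]        # [1,2]  (limit/2 consumes its second argument as a stream)
```

Nothing at the call site tells the three apart, which is the point being made about naming conventions.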
There are only two jq-coded builtins that take more than one argument in master right now, and both are irregular but easily fixed to be regular:
This patch makes them work more regularly in @pkoppstein's sense:
I'll push it. Then we should look at 1-argument builtins that could be doing something similar but aren't.
We should inspect:
EDIT: |
See #521.
BTW, I'd like to think of "regular" and "special" by relation to Lisp. "regular" will be like any defun in Lisp, and "special" will be like a macro or special form in Lisp. Thus Before I stepped all over jq the number of special forms was smaller, and some had special syntax (e.g., Something like:
which would be equivalent to:
It'd be syntactic sugar to help avoid #521 and it'd help document what is special about any one def. |
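For reference, released jq (1.5 and later) accepts a `$name` parameter form that the manual documents as shorthand for binding the closure's outputs with `as`. The sketch below shows that documented form; it is not necessarily the exact snippet from the comment above, and the function names are hypothetical.

```jq
# Run with: jq -n -c -f example.jq   (requires jq 1.5 or later)

# Closure parameter, bound explicitly with `as`.
def addvalue_closure(f): f as $f | map(. + $f);

# Value parameter: `$f` is bound once per output of the argument.
def addvalue_value($f): map(. + $f);

# Both emit [11,12] and then [21,22] for the input [1,2].
[1,2] | [addvalue_closure(10,20)] == [addvalue_value(10,20)]   # true
```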
@nicowilliams I like that proposal much, much more than the original one.
@stedolan But it's really just syntactic sugar, so don't expect it anytime soon :) (OTOH, it's probably very easy to implement!)
… uniq(stream)

The primary purpose of this commit (which supersedes PR jqlang#2624) is to rectify most problems with `gsub` (and also `sub` with the "g" option), in particular jqlang#1425 ('\b'), jqlang#2354 (lookahead), and jqlang#2532 (regex == "^(?!cd ).*$|^cd ";"")). This commit also partly resolves jqlang#2148 and jqlang#1206 in that `gsub` no longer loops infinitely; however, because the new `gsub` depends critically on match(_;"g"), the behavior when regex == "" is sometimes non-standard. [*1]

Since the new sub/3 relies on uniq/1, that has been added as well [*2].

The documentation has been updated to reflect the fact that `sub` and `gsub` are intended to be regular in the second argument. [*3]

Also, _nwise/1 has been tweaked to take advantage of TCO.

Footnotes:

[*1] Using the new gsub, '"a" | gsub( ""; "a")' emits "aa" rather than "aaa" as would be standard. This is nevertheless better than the infinite loop behavior of jq 1.6 in this case. With one exception (as explained in [*2]), the new gsub is implemented as though match/2 behavior is correct. That is, bugs in `gsub` behavior will most likely have their origin in `match/2`.

[*2] `uniq/1` adopts the Unix/Linux name and semantics; it is needed for the following test case:

gsub("(?=u)"; "u")
"qux"
"quux"

Without this functionality:

Test jqlang#23: 'gsub("(?=u)"; "u")' at line number 100
*** Expected "quux", but got "quuux" for test at line number 102: gsub("(?=u)"; "u")

The root of the problem here is `match`: if `match` is fixed, then gsub would not need `untie`. The addition of `uniq` as a top-level function should be a non-issue relative to general concern about builtins.jq bloat: the line count of the new builtin.jq is significantly reduced overall, and the number of defs is actually reduced by 1 (from 111 (ignoring a redundant def) to 110).

[*3] See e.g. jqlang#513 (comment)
should be equivalent to: