-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Auto-propagated function parameters #2327
Comments
Scala is the only language I know of with this feature set. It's generally (ab)used to provide things like The only scenario where this even initially seemed fine in one of the codebases I maintain was to pass a logging parameter through the callstack, but even when doing that it's occasionally confusing to get weird error messages about it being missing or not being found. Besides, I don't think there's enough research and language design experience that suggests that this is consistently beneficial to implement it in a language with a production focus. |
If I understand the example correctly, you still have to write
Which to me at least looks almost identical in readability and ergonomics. Perhaps writing the In other words, I'm just not seeing the motivation for this. |
@ahmedcharles I apologize for my lack of experience with Scala, I only know it at a basic level. Briefly looking at the Scala docs, there are two similar features around
@ahmedcharles I'm not exactly making up this problem. The java webservices I've built for production uses has this pattern (issue?) all over the place. I don't like these options, but a lot of people today (over the last decade) end up using either thread local storage or "request scoped" Dependency Injection (DI), which internally uses thread local storage, to "improve ergonomics". In java they even try building "method interceptors" called aspects (I'm not a fan), again using thread local storage. All of these are intricate and nuanced workarounds for this relatively simple, missing compile time feature. You might even call this feature "compile time, strongly typed, dependency injection". Hopefully this helps alleviate your concerns around "enough research"? @Ixrec yeah an "options" struct is a great alternative, thanks I'll add it to my proposal for visibility. However, while it can help improve the ergonomics, it does end up being kind of awkward and also limiting. I blame myself for being lazy and only adding one example. Not all calls will need all of the information. Passing around an options struct, doesn't allow you to write functions idiomatically (i.e. limit variable scope as much as possible). Keep in mind middleware might be written in different crates, not exactly able to share a common options struct. For example: // Imagine the database client is a plugin to a web framework.
// It only knows about `Query`s and `Metrics`, not about `User` or `Request`.
// A single struct would not be known to the database client implementor.
// Framework/Library author implemented
// `database.call` = fn call(query: Query, with { metrics: Metrics })
// Service owner implemented
// fn query_builder1(with { user: User })
// fn query_builder2(with { user: User, request: Request })
// fn build_response(result1: ..., result2: ..., with { user: User, request: Request })
fn handle_request(with { request: Request, user: User, metrics: Metrics }) {
let result1 = database.call(query_builder1());
let result2 = database.call(query_builder2());
...
return build_response(result1, result2);
} You could just take my word for it, or look at most web applications, these Maybe this showcases the benefits of this proposal more? If so, I'll update my original post with these examples / motivations. |
@Ixrec thinking about it a little more, I do think you could use an options struct, with a lot of |
If you look at the Scala example for implicit parameters, you'll note that the value of the implicit is purely based on matching types. I actually don't know what happens if there are two implicits declared in scope with the same type, though I hope it results in an ambiguity error. Basically, the names of the parameters are irrelevant. You seem to be proposing that the names are important, but that basically results in adding named parameters to the language, which hasn't been done so far due to complexity. Requiring that the names match makes the feature less flexible and useful (because everyone has to agree on using the same names, or it ends up being more verbose than the current syntax). But alternatively, basing it purely on type results in the chaos (my word) seen in Scala. Let's take your handle request example and make it more complete. // Imagine the database client is a plugin to a web framework.
// It only knows about `Query`s and `Metrics`, not about `User` or `Request`.
// A single struct would not be known to the database client implementor.
// Imagine that you're using JWT auth to extract user information.
// I'm going to ignore that this would probably be middleware for the moment,
// but the problem is the same.
// Framework/Library author implemented
// `database.call` = fn call(query: Query, with { metrics: Metrics })
// Service owner implemented
// fn query_builder1(with { user: User })
// fn query_builder2(with { user: User, request: Request })
// fn build_response(result1: ..., result2: ..., with { user: User, request: Request })
fn handle_request(with { request: Request, metrics: Metrics }) {
// let's pretend we have a way to make user act like it was a with parameter
let with { user } = jwt::User::from_bearer(request.auth_header());
let result1 = database.call(query_builder1());
let result2 = database.call(query_builder2());
...
return build_response(result1, result2);
} This is all nice and fancy, but now the I suppose it's obvious now why Scala ignores parameter names. But it still requires the same types. In an ecosystem like Rust's where there are lots of small crates that integrate with each other, primarily using traits, I don't really see this feature being beneficial. As soon as you start mixing libraries, passing variables is the clearest solution and other than being verbose, I don't really see the problem. |
Haskell supports a feature like this under -XImplicitParams so there's certainly prior art and research (Lewis2000, https://dl.acm.org/citation.cfm?doid=325694.325708) around this subject. My opinion of that language pragma has always been that it is a misfeature that optimizes for writing ergonomics to the detriment of reading ergonomics. As reading code happens more frequently than writing, I believe we should default to favoring the ergonomics of reading over writing whenever there is a conflict, unless there is a strong argument for an exception in the particular case. To me, the legibility of code will be the major drawback of this proposal should it come to fruition. |
@Centril I'm entirely in agreement. My preference is always to optimize for readability. I'd take this one step further, as much as possible we should optimize for re-readability. I'm not familiar with this features' implementation in Haskel or why it might be a miss feature there. I think wether this feature is a misfeature is highly dependent on what you believe is signal and what is noise. There are two places this feature can effect readability, a function's signature, and a function's implementation. The feature I'm requesting only changes the function signature marginally, as such I don't think there is an ergonomic win or loss while reading function signatures. The rest of my comment will cover the readability impact of this feature in function implementations. A common starting point: I think we can both agree, Similarly, in some problem domains, while reading over the implementation of a particular function, certain variables are noise, not signal. Reading them over and over doesn't tell me more about the purpose of the function. I'll de-sugar my earlier example: fn handle_request(with { request: Request, user: User, metrics: Metrics }) {
let result1 = database.call(query_builder1());
let result2 = database.call(query_builder2());
...
return build_response(result1, result2);
} vs fn handle_request(request: Request, user: User, metrics: Metrics) {
let result1 = database.call(query_builder1(user), metrics);
let result2 = database.call(query_builder2(user, request), metrics);
...
return build_response(user, request, result1, result2);
} I'm not able to contrive an adequately involved example, but note that Additionally, if you asked me, "whats the purpose of the
When it comes to groking a function for the 2nd, and especially 15th time, I would call these types of parameters which have repetitive, predetermined usage patterns, noise. I'd postulate, that while I'm not familiar with them, there exists a similar set of "mechanically passed" parameters in every codebase which shares a problem domain (i.e. anywhere you'd build and share a framework). |
As an overly broad rule, syntax-level tricks bring confusion while type-level tricks bring power and clarity. I think a measure of type inference plus dependent types provides a much better route to this :
We'll only have const dependent types in near future, but one could imaging improving that to constant-relativet-to-the-local-call-stack, no-interior-mutability, etc. dependent types or whatever. In practice, you actually find yourself reaching for implicit parameters not to avoid typing, but so that people cannot pass the wrong thing as easily, so dependent types provide one stronger solution, but so do other type system extensions: Improving ephemeral types ala structural records aka anonymous
You can actually enforce invariants with design by contract or other verification tools, which frequently sounds better. In this vein, your |
@burdges I do not mind if we use the type system over an AST transform.
Passing I think the type inference would be the blocker, but if the lang/compiler team feels its the better route, and can achieve the same effect, I'm happy to change this to go with that solution instead. |
As for my thought with the RFC, I personally would like to get some runtime on seanmonstar's idea before deciding we need to extend the language and in what way. Granted, I'm always a bit conservative about adding language features due to concern with bloating the language (one of the things that turned me off from D). For background, my use case of a "Context" is I maintain the Rust implementation of the Liquid templating language. It has "global" state in both the parser and the renderer and it'd be nice to find better ways to express that rather than passing those structs up and down the call stack. Things I think this RFC could use
I assume we can't auto-wire by name alone because then the types might be invalid. Do we need auto-wiring to match name and type to ensure we don't miswire things? |
I want to echo @Centril's sentiment about how this RFC seems to optimize for writing ergonomics rather than reading. I'd like to see stronger value to outweigh that. What this RFC is primarily competing with is an explicit context argument. If this is a struct (or anonymous struct at some point), it can evolve with the library or application that uses it without having to refactor all call sites. For the places where it's changed we have similar unpacking/packing problems as we would have with this RFC.
In the Java world this was IMO mostly unsuccessful because there wasn't stable language support for it, leading to pains when integrating AOP into the real world.
I agree with this. I just want to clarify that this RFC wants to add context-capabilities to any function (if I'm reading it correctly), while the proposal above is limited to receivers. Because of this, it doesn't propagate well across methods (this comment by withoutboats). That is a potential benefit that this RFC has compared to that proposal. |
I think the example shown here has moved the conversation into an area that I personally don't find that useful where @seanmonstar's proposal can also fit. However the place that I find most interesting is context variables that can appear out of thin air by a previously unknown piece of code. eg: think of it more as Because of asyncio Python recently gained context variables which are variables that are just generally available but can have different values per task: https://www.python.org/dev/peps/pep-0567/ There are a lot of uses for this:
Such a system can also come in useful when tests are used. For instance right now println! can be redirected temporarily for the test runner alone on a per-thread basis through an undocumented and unstable API. With a context system one could imagine a world where global default streams can be tapped in on a per task basis. I made an earlier proposal to Python where I imagined a system where the context variable can define what happens if a new task is spawned out of a task (does the value get copied, cleared etc.). I figured this is useful as there are typically two ways to spawn a task: a task that aids another task (http fetch / db query during web api http request handling) or a task that spawns a whole context by itself (http request for instance). There needs to be a generally well understood API to spawn such things differently or the contexts will all get messed up. There is an almost neverending list of complexities for this and I can see if I can find some of the notes I wrote. I also know that the node community didn't manage to settle on a new system yet but did leave some notes behind as well about how such a system could work. In the process of working on this I noticed that there are quite a few systems that can be used for inspiration (.NET's logical call contexts and nodejs' domains and various libraries working with the async hooks and now Python 3's context variables). //EDIT: one thing I forgot to mention is that it's I think very important that new data can be made up by 3rd party libraries into existing contexts. This is a crucial requirement for systems such as Sentry or auditing code. |
What happened to "explicit over implicit"? Are we not doing that any more? |
@sandeep-datta Type inference makes things implicit, lifetime elision also. I think the key is to be explicit where it matters most for clarity and implicit where it reduces pain the most... You might be interested to read @withoutboats's excellent blog post on this subject. |
@Centril Lifetime elision isn't really implicit. The compiler isn't allowed to make a choice based on information that isn't local to the function definition and there's only one possibility. Granted, people may not know what that is in terms of having the lifetimes in the source, but then again, those same people wouldn't be likely to understand the lifetimes if they were visible in the source either. This feature results in any function call within the definition of a function with N implicit parameters having 2^N possible ways of being called, ignoring the order of the parameters. I don't want to do the math with ordering being a consideration. 1 vs 2^N isn't really comparable, unless N is 0. |
I think that's the wrong way to think about it. Well implemented context variables are just a superior version of |
I strongly disagree with any kind of implicit argument passing in any form or fashion. What is wrong with passing the damn arguments around? I cannot imagine any code review going faster thanks to implicit parameters.
I disagree with that. All three existing features you mentioned require something to be written where the code is impacted, be it a question mark or a method call. Implicitly passing arguments means I need to always remember what even was the function signature and what were its implicit arguments. Furthermore, it can make code way more confusing, if there is a function with implicit params which is named the same thing as a trait method with a different arity.
Accessing environment variables from the middle of code is asking for problems, just like rampant use of global variables. |
To provide another piece of terminology that might help people understand this proposal: Essentially we are proposing adding dynamic scope. Scala implicits are a syntactically different but semantically identical example of dynamic scope. Environment variables are another example of dynamic scope. Exception handlers are typically dynamically scoped. Dynamic scope is very useful - many languages have exceptions/exception handlers, and environment variables are widely used for configuration. Dynamic scope can be used to implement thread-locals/coroutine-locals in an elegant, unified way. But: dynamic scope can be easily abused, and probably should not be used outside of foundational libraries. So it's a bit of a toss-up. |
See my comment above but the simple answer is that it's impossible. The goal of a context system is to make things which are currently global state (think of things like stdout, environment variables, execution flags such a if tracebacks should be shown etc.) slightly more local. Traditionally TLS is used for this, but this cannot work with async systems. More importantly though we need to be able to compose code. Different systems will allow you to pass different data through. Even the best designed system will require TLS sooner or later to capture vital state. Security contexts in multi tenant systems for auditing are a really good example of something which is just too important to pass around. Same thing with logging integrations in general. I encourage you to look at how logical call contexts in .NET work to get an idea what they are used for. Likewise why the node community has domains and similar systems. They are vital for the operation of the raven-node client for instance. (Any api that currently uses TLS needs to use a different system to work in an async environment.) |
How can it be impossible, given this issue is about sugar that corresponds to passing arguments explicitly? Am I missing something? |
@nox i was talking about the need for context variables, not about syntactic sugar for implicit argument passing which I'm opposed to. |
@nox I'm thinking that this RFC isn't discussing the underlying motivations for the feature--to its detriment. @mitsuhiko described them a while up: How do we make a new cross-cutting feature, like logging, auditing, collecting metrics, etc work with an existing library without needing to update the function signatures of the entire library for each feature? |
Also, after thinking about it, would a feasible solution be using a builder pattern that constructs a context , and move all of the functionality of the api to methods on the context? (And then wrap this for cross-cutting needs?) |
By updating the function signatures of the entire library, such that you can see the code flow and not be completely lost 3 months in the future. |
@epage I brought up aspects and my dislike for them in my second post. I'm just trying to batch a bunch of ideas, and I'll update the proposal. Basically, regarding aspects. I do not like aspects because they end up needing TLS, and while I don't have a problem with TLS in general, I do have an issue when it is used as a crutch for a missing feature, and is possibly misused. I'll update the proposal with my thoughts. @mitsuhiko I don't mind if we use Also, I do prefer if every function in the call stack (even if it doesn't need the context) have to declare the existence of the context. This is purely for compile time correctness, and based on Rust's rule of not doing global checking. But I could be convinced otherwise.
@nox I actually don't mind "passing the damn arguments around". I would prefer if it was more ergonomic :), but I end up using this "passing arguments" style in all of my production Java projects. That said, this isn't about you or me, I know that other people view this "passing ... around" as a code smell, and the conversations not only get contentious, but most developers go looking for any solution which might help (i.e. request scoped DI, AOP, manual TLS), and I've had to fix unforeseen consequences of this type of code. I don't love it, but I could be convinced to introduced something like JavaScript's spread operator, |
@yazaddaruvala The damning was unnecessary, sorry for that. For what it's worth it was about them plain old arguments and not the people wanting to omit them. :) In my experience on Servo, at no point we ever wished for implicit argument passing, but we did just group around a bunch of arguments multiple time in some sort of A small thing I often do is select an identifier in my editor and let it highlight all other occurrences to see where it's defined and used. This is also one of the key features of searchfox.org. Implicit argument passing breaks that. |
@nox This makes a lot of sense to me, and I see more why you guys are pushing back. If you're not dealing with frameworks, you wouldn't run into this pain. The work you've done for the Additionally, within a single application its relatively easy to build a single Context struct shared by all. However, with a framework, you use types from the framework (i.e. not controlled by you), plus types you've made (i.e. not controlled by the framework author), plus types from middleware (i.e. types not controlled by either the framework author or you). Creating a single |
I worked 6 years for a dating website before I worked on Servo, I did not feel pain because I was missing implicit argument passing.
This applies to Servo, you would be surprised by how many independent parts there are in it.
Again, this is entirely the same with implicit argument passing except that you can't track the code flow just by reading it. There is nothing that implicit argument passing can do that explicit argument passing can't, as per the topic discussed in this topic. If you are actually arguing about something else, you should probably open a different issue. |
Rocket uses function decorators to validate the user when handling a HTTP request. Perhaps Yazaddaruvala can use a similar approach to solve the issue in the RFC - But it may only compile on nightly. |
And that website / service did not use thread local data? I find that hard to believe. In PHP effectively everything is thread local. Node uses domain information excessively behind the scenes. Django and Flask in the Python world are using TLS behind the scenes. Ruby on Rails uses TLS as well.
That does not work if data does not travel through the entrypoint. A helper function internally cannot depend on data that is not injected in the entrypoint. For instance an i18n system in rocket needs to have an injector in every view function and then pass an i18n object to all functions throughout the codebase manually. |
The things you cited correspond to code using globals, which, yet again, doesn't correspond to implicit argument passing. Could we focus on the merit of implicit argument passing, which by definition is syntactic sugar for something that corresponds exactly to passing arguments around explicitly, and not talk about similar-yet-unrelated features which should go in their own RFC or issue? |
So to avoid us talking about different things:
You replied to a comment about me arguing that implied function argument passing is not a sufficient solution for actual problems (need for context variables) but in a way that I was assuming you are arguing against the need of context variables / TLS supporting async tasks altogether. If that's not the case then please ignore my comments. |
As an aside, you can always do mutable implicit parameters with a state monad anyways, except maybe not your missfeature
You'll want a In other words, you should attempt to do this using coroutines, almost as if it were an |
At minimum, actually doing this via a state monad with coroutines/generators providing the |
Either I'm hopelessly confused, or most of this discussion is about hypothetical ideas that have very little in common with what originally was or currently is in the proposal at the top of the thread. Unfortunately this design space is one where it's almost impossible to fairly evaluate an idea without some concrete code samples, and all the samples presented here so far are only convincing me that most of these ideas are either not even a net win for write ergonomics, or are making a small win write ergonomics at a substantial cost in read ergonomics. I say "most" because there's a handful of comments that, at least to me (I've never looked into AOP), are so vague I can't even form enough of a strawman proposal in my mind to ask specific questions about them. In particular:
What is a "context system" that's not just using a bunch of globals or passing a context argument around? I can imagine helpful sugars or conventions on top of those two options, but nothing that changes the fundamental trade-offs. |
I've compiled all of the feedback, and updated the original proposal. Changed the motivation, extended it to show the current solutions, why they are not adequate. I've added extra example and alternate proposed solutions to achieve the same end result, without using the same mechanism. |
What about function call sites? You don't always need to read the function signature if you can infer it from the function name and arguments used at the call site. If I had to read the signature for each function call in a body of code it will soon get very tiresome. This is an ergonomic setback for readability. |
Please consider your tone here. It is not very cordial. I assume @yazaddaruvala has legitimate reasons for pursuing this and considers this to be an improvement even if only for writing. Reasonable people can disagree over what the trade-offs are and where we land on scale. I believe if nothing else some interesting ideas have been spread here and discussed in this topic and we are all richer for it as people. |
I am sorry that came off a little harsh. I have removed the offending sentence now. |
"Whenever there is a difference of opinion between [the writers of code and the reader of code], the readers are always right and the writers are always wrong." - 10min into Yaron Minsky's Effective ML lecture Said differently, any utility derived from making code shorter actually comes entirely from making the code more comprehensible, i.e. more readable, even say Implicit parameters can only make code more readable when used in constrained ways, like DSLs or @mitsuhiko list. It's not the implicitness per se that makes implicit parameters a missfeature but lack of constraints that tell the reader what is going on. There are however several type system approaches like do-notation for state monads, dependent types, and structural records that afaik provide any readability improvement you might gain through implicit parameters. I only named fancy ones so far, but simply using
After writing that, I now realize you're likely fighting the borrow checker here, so actually NLL will largely solve your problems:
I'll point out that Haskell's FRUs are more terse so that last line becomes Also, I'll point out that if you wanted another |
You might also ask if there is much sense in destructuring
|
I just noticed the |
@burdges this thread is pretty massive, but you (and others) might be interested in jumping back to this comment which discusses that approach. |
No, no, no, please don't do this. It will be extremely confusing and hard to read. Passing around context parameters is not hard. I've done it several times, and it didn't hurt… it was like a 1 minute task to complete. On the other hand, I would be very unhappy if I had to read code that had who knows how many implicit name bindings who knows where, given especially that Rust is very lenient when it comes to shadowing, so this has the most realistic potential to introduce (or hide the existence of) serious bugs. |
@yazaddaruvala Just a note on the RFC procedure: If you want to file a formal RFC to be considered by the language team for inclusion into Rust, this needs to be done as a pull request to this repository. Check out some merged and closed RFC to get a sense of what the process will look like. |
Ping from triage: @yazaddaruvala, are you intending to make a formal RFC out of this? |
This is a really old RFC and a lot has changed in the Scala world which a lot of the Taking a Scala 3 example and pseudo translating to Rust:
|
I've never created an RFC or proposal before, some mentoring would be appreciated!
Summary
The purpose of this proposal is to provide an ergonomic yet explicit way to get access to a set of variables deep in the call stack, or for a lot of sibling calls which need the same value. You may even call this feature, "compile time, strongly typed, dependency injection", or "context variables". This particular proposal is about an implementation based on the addition of an AST transform to give Rust some syntax sugar to achieve this.
Motivation
There are a lot of times you want to propagate a function parameter deep into the call stack or to a lot of child calls at the same level in the stack.
The goal of this proposal is to provide Rust as it is today, with some ergonomic wins. There should be minimal/zero drawbacks to this syntax sugar. Ideally the syntax would be extremely intuitive, if not, less than a page of documentation should be needed.
Ergonomic Goals
There are two places this feature can effect readability, a function's signature, and a function's implementation.
Function signatures:
These change marginally, as such I don't think there is an ergonomic win or loss while reading function signatures.
Function implementations:
A common starting point: I think we can all agree,
for x in ...
(vs it's legacy counterpart) is far more signal (i.e. more readable), even tho how we accomplish the iteration is no longer readable. This is because the purpose was to iterate, not to increment registers and dereference arrays.Similarly, in some problem domains, while reading over the implementation of a particular function, certain variables are noise, not signal. Reading parameters which are mechanically passed over and over doesn't help me grok the purpose of the function.
Additionally, if you asked me, "In a web service, whats the purpose of the request variable if you can't see where it is used?"
Current solutions
env::var
, orstdout
.User
objects stored in TLS and then not cleaned up before the next requestContext
objects which encapsulate all the state neededExplanation
Note: Good syntax is very important, but for this iteration, I am going to use placeholder syntax. We can discuss syntax after, if the general concepts of this proposal are more or less discussed and well received.
Functions will have special syntax to define auto-propagated params and with it an auto-propagation scope. Auto-propagated params with identical names and types will be automatically propagated by the compiler to any sub-call within scope. Functions within an auto-propagation scope, can have any or all of their auto-propagated params overridden. When called without an auto-propagation scope (usually at the top of the call stack) overrides can be used to set the initial auto-propagation scope.
The best way to explain this is with example sugar/de-sugar variants.
The real ergonomic wins start showing in large code bases. As such no example can really do it justice.
Open Questions
Drawbacks
At initial glance the syntax sugar seems like magic. However, a single page of documentation would highlight a de-sugared example, and it will be very straight forward / intuitive for all. This is similar to a first encounter with
?
orInto
orself.___()
.database.call(query_builder(_), _);
database.call(query_builder(...), ...);
New syntax for "special" parameters. Users could be overwhelmed.
Excessive use of generics can be an issue regardless of parameters being implicit, but this issue gets more-likely as we create more parameters for functions.
This particular syntax (we can fix this when we discuss syntax):
FAQ
Alternatives
@burdges suggested that we could possibly use something similar to dependent types and type inference. I'm happy to explore this route if everyone prefers it. Although it likely will limit the context to have unique types.
@burdges has also suggested a possible
State
monad with the possible inclusion ofdo
notation.@mitsuhiko has a suggestion, which I understood as TLS with a tighter scope. Basically instead of "thread local storage", it could be "call stack local storage" (my name). His proposal seems to suggest access through a global like function e.g.
env::var
. I would still like to put the function's dependencies into the function signature somehow. And ideally, get some compile time checks, but without adding the context to all function's signatures, this is compile time check is impossible because of "no global inference".Conclusion
Adding a relatively simple AST transform would open Rust up to some pretty nice and easy ergonomic wins. The syntax would need to be discussed and could bring in more drawback, but at least for now the drawbacks seem minimal and within acceptable bounds.
I'd love to hear other people's thoughts:
The text was updated successfully, but these errors were encountered: