-
-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discussion on graph shape #717
Comments
My suggestion is to write a more general optimization, one which will work both for static expressions and for dynamic expressions which stop changing mid-way through the execution. Let's detect when a sub-graph stops changing, and then let's rewrite the nodes around it so that it looks as if that sub-graph was always part of the original graph. I think this optimization fits better with the existing optimization when Stream() indicates that it will no longer send values downstream. |
Because the functions themselves are the easy part, they are the base case! The more interesting part of the analysis is to look at an expression which returns a function, and to determine whether it is constant or not. |
You were right, I have changed my mind about this: let's generate the optimized sub-graph right away rather than rewrite the graph at runtime! The reason I changed my mind is because I now see that if we do, then we can use the same static analysis for this feature and for the "send a function to a resource" feature. This static analysis determines if a given function or expression is "timeless", meaning that it doesn't do shenanigans like sending more than one value downstream or waiting until it receives multiple inputs before sending that single output. There are several variants to this question:
Note that it is possible for a timeless Expr to produce a non-timeless FuncValue! For example, Shell is a timeless constant expression which produces a single FuncValue, but that FuncValue is not timeless. And for higher-order functions like Map, an expression calling Map on a constant list and a constant function might be timeless, meaning that it emits a single list and then stops. But what if this list contains FuncValues? This list might contain a mix of timeless and not timeless FuncValues! I think we should add an The Value returned by
Sounds like a plan? |
I'm not sure if those specific shenanigans matter, do they?
I think the simplest thing might be the easiest thing... Any function we statically label at design time can never be allowed to be part of the static graph generation. IOW,
I didn't 100% understand this part, but it might be best to discuss in person today. I'm excited! Cheers |
Currently, The optimization is to instead compile this directly to a ShellFunc. You are right that it doesn't matter whether ShellFunc does some shenanigans or not; what matters is whether the ConstFunc does. This is why I am careful to distinguish between whether a Expr is timeless and whether the FuncValue it produces is timeless. |
Indeed, when writing a static analysis, we usually need to be conservative and to only optimize the programs which we can guarantee are safe to optimize, and then some more complex programs won't be optimized even though in theory they could be. The reason we have to be conservative is because at compile time, we don't have all the values yet, so we have to pessimistically assume that the values we get at runtime are those which would break the optimization. In this case, however, we are lucky because the static analysis is determining which expressions are constant. And those are precisely the expressions for which we do have the value at compile-time, so we don't have to compromise! |
Sounds good to me =D I've updated git master with some of the easy things I could already merge, and I've rebased my feat/graph-shape with the changes I said I would make... I put lots of XXX in places where you should look at things. Cheers! |
I have now pushed answers to those |
Okay, I took your branch, and did the following:
This last commit is probably completely wrong, but it was the only way I could make the types work. I'm not sure how I would take a full.FuncValue and turn it into a CallableFunc any other way. Maybe you can shed light here and I can do the plumbing if need be. Lastly, and just for reference, instead of CheckParamScope, we can actually use the Cheers! |
The commit is indeed wrong. The idea is that we want a The next step is to go through all the remaining places where a In particular, it would be important to find a way for the many builtins which happen to be timeless, like addition, to use |
We left off here...
What was the advantage of doing that rather than just not generating a subgraph via Txn to begin with for the nodes that won't ever need it? IOW, what is the downside of just generating those function graphs to be the same shape as they currently are in git master? (For the non-lambda functions of course.)
We can add a field to function Info to say whether or not they will need Txn or not. I don't mind manually labelling each function. It's actually easy, the default can be false, and then we have a single function that needs labelling as true so far (iter.map).
The text was updated successfully, but these errors were encountered: