Fixes case where optional user inputs broke computation #133
Ray issue is due to ray-project/ray#25282
a234bbb to cd8611c
Ok, I think I see what you're getting at here. I think it's OK, but I'm not 100% sure. I guess my question is: why would anyone ask for the user-provided input as an output? Feels a little wonky, but I don't see a great reason why not...
I think this looks good -- if I fully understand. Don't 100% get the original use-case this was breaking though...
Added an example breaking case to the description.
Ok, I think I understand. Surprised we didn't catch this the first time, and it feels a little hacky, but I think it's a fine solution.
Alright, yep, makes sense. Thanks!
cd8611c to ba32928
The execute function gets all upstream nodes of the node it is required to compute. This means there will likely be "user input" nodes to cycle through. When we were computing the DFS value for them, we would assume they were required.

To illustrate, suppose you had a function with an optional input, `baz`, e.g.:

```python
def foo(bar: int, baz: float = 1.0) -> float:
    ...
```

This meant that if you did not pass in a value for `baz`, and `baz` was a user input, Hamilton would complain that a required node was not provided, even though it was not required for computation.

So to fix that, in execute any user node is now marked with `optional`. I believe this is fine to do, because if the node is not actually optional, there will be a node in the graph that has `baz` as a REQUIRED dependency, and thus things will break appropriately.

To help with that, I also fixed and added some unit tests. One unit test ensures that we don't stop passing `None` values as part of the kwargs to the function. Passing them through is what we do now; removing them was another way to fix this bug, but I think it would be the wrong way to go about it. I also added tests to ensure that node order does not change the result.
ba32928 to 21c007e
The execute function gets all upstream nodes of the node it is required to compute. This means there will likely be "user input" nodes to cycle through. When we were computing the DFS value for them, we would assume they were required.

To illustrate, suppose you had a function with an optional input, `baz`, e.g. `def foo(bar: int, baz: float = 1.0) -> float`, and then requested `foo` from the driver. There would be three nodes to compute: `foo`, `bar`, `baz`.

This meant that if you did not pass in a value for `baz`, and `baz` was a user input, Hamilton would complain that a required node was not provided, even though it was not required for computation.

So to fix that, in execute any user node is now marked with `optional`.

I believe this is fine to do, because if the node is not actually optional, there will be a node in the graph that has `baz` as a REQUIRED dependency, and thus things will break appropriately.

To help with that, I also fixed and added some unit tests. One unit test ensures that we don't stop passing `None` values as part of the kwargs to the function. Passing them through is what we do now; removing them was another way to fix this bug, but I think it would be the wrong solution.
To reproduce the bug:
Causes:
NotImplementedError: bar was expected to be passed in but was not.
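The failure mode and the fix described in this PR can be sketched with a small toy model. This is not Hamilton's actual code: the `execute` helper and its `user_nodes_optional` flag are hypothetical names introduced only for illustration, and the "graph" is reduced to a single function whose parameters stand in for user input nodes.

```python
import inspect
from typing import Any, Callable, Dict


def foo(bar: int, baz: float = 1.0) -> float:
    """Example function: `baz` has a default, so it is an optional input."""
    return bar * baz


def execute(fn: Callable, inputs: Dict[str, Any], user_nodes_optional: bool) -> Any:
    """Toy stand-in for a graph executor (hypothetical, not Hamilton's code)."""
    params = inspect.signature(fn).parameters

    # Visit every upstream "user input" node, as a DFS over the graph would.
    for name in params:
        if name not in inputs and not user_nodes_optional:
            # Old behaviour: every user input node is assumed to be required.
            raise NotImplementedError(
                f"{name} was expected to be passed in but was not."
            )

    # New behaviour: user input nodes are optional at this stage, and the
    # consuming function decides what is really required.
    kwargs = {}
    for name, param in params.items():
        if name in inputs:
            kwargs[name] = inputs[name]  # explicit None values are passed through
        elif param.default is inspect.Parameter.empty:
            # A genuinely required dependency still fails loudly.
            raise NotImplementedError(
                f"{name} was expected to be passed in but was not."
            )
    return fn(**kwargs)
```

In this sketch, `execute(foo, {"bar": 2}, user_nodes_optional=False)` raises for the missing `baz`, while the same call with `user_nodes_optional=True` returns `2.0`. Omitting `bar` still raises in both modes, which mirrors the claim that a truly required dependency breaks appropriately.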