Skip to content
This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Implements opinionated decorator lifecycle #28

Merged
merged 8 commits into from
Feb 6, 2022
Merged

Conversation

elijahbenizzy
Copy link
Collaborator

@elijahbenizzy elijahbenizzy commented Dec 8, 2021

This adds to the set of decorators that can be applied to a node,
and puts them in a uniform setting. The ordering is provided in
function_base.resolve_nodes -- the algorithm is as follows:

  1. If there is a list of function resolvers, apply them one
    after the other. Otherwise, apply the default function resolver
    which will always return just the function. This determines whether to
    proceed -- if any function resolver is none, short circuit and return
    an empty list of nodes.

  2. If there is a list of node creators, that list must be of length 1
    -- this is determined in the node creator class. Apply that to get
    the initial node.

  3. If there is a list of node expanders, apply that. This must be a
    list of length one. This gives out a list of nodes.

  4. If there is a node transformer, apply that. Note that the node transformer
    gets applied individually to just the sink nodes in the subdag. It subclasses
    "DagTransformer" to do so.

  5. Return the final list of nodes.

Additions

  • Transformer node

Removals

Changes

  • Refactors decorator lifecycle to be clearer/more flexible

Testing

Screenshots

If applicable

Notes

Todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows the standards laid out in the TODO link to standards
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Python

  • python 3.6
  • python 3.7

@elijahbenizzy elijahbenizzy force-pushed the decorator-refactor branch 4 times, most recently from b7a7681 to f5a1ba7 Compare December 9, 2021 04:52
@elijahbenizzy
Copy link
Collaborator Author

elijahbenizzy commented Dec 9, 2021

TODO:

  • Add transform to documentation
  • Ensure defaults/optional can be used with the transform superclass and are wired through correctly
  • consider replacing eval for expression with a limited set of operators (arithmetic)? Or allow for an unsafe parameter?
  • Rethink the API slightly -- better to have @transform_to("foo=foo+bar") or @transform_to("foo+bar")?
  • use the processed function name (E.G. with __ removed) rather than the unprocessed name

Copy link
Collaborator

@skrawcz skrawcz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what's net new here? Just the transform decorator? Suggestion on name: augment. Transform is a bit overloaded a term.

Otherwise the decorator lifecycle is largely a refactor of existing behavior, right?

hamilton/function_modifiers_base.py Outdated Show resolved Hide resolved
tests/resources/layered_decorators.py Outdated Show resolved Hide resolved
@elijahbenizzy
Copy link
Collaborator Author

So what's net new here? Just the transform decorator? Suggestion on name: augment. Transform is a bit overloaded a term.

Otherwise the decorator lifecycle is largely a refactor of existing behavior, right?

The decorator lifecycle is solidified and expanded -- moved some things around, but yeah, a refactor of existing behavior. Technically this allows models + parametrization together (whereas before it wouldn't), although that could get a little weird and probably won't be used much.

@elijahbenizzy elijahbenizzy force-pushed the decorator-refactor branch 2 times, most recently from 4e29797 to 376b8c8 Compare December 14, 2021 04:47
@elijahbenizzy elijahbenizzy force-pushed the decorator-refactor branch 2 times, most recently from 0ec3f40 to 096082d Compare February 5, 2022 00:56
@elijahbenizzy elijahbenizzy changed the base branch from main to distributed_prototype February 5, 2022 02:50
@elijahbenizzy elijahbenizzy mentioned this pull request Feb 6, 2022
11 tasks
@elijahbenizzy elijahbenizzy marked this pull request as ready for review February 6, 2022 04:03
Base automatically changed from distributed_prototype to main February 6, 2022 20:54
This adds to the set of decorators that can be applied to a node,
and puts them in a uniform setting. The ordering is provided in
function_base.resolve_nodes -- the algorithm is as follows:

1. If there is a list of function resolvers, apply them one
after the other. Otherwise, apply the default function resolver
which will always return just the function. This determines whether to
proceed -- if any function resolver is none, short circuit and return
an empty list of nodes.

2. If there is a list of node creators, that list must be of length 1
-- this is determined in the node creator class. Apply that to get
the initial node.

3. If there is a list of node expanders, apply them. Otherwise apply the default
nodeexpander This must be a
list of length one. This gives out a list of nodes.

4. If there is a node transformer, apply that. Note that the node transformer
gets applied individually to just the sink nodes in the subdag. It subclasses
"DagTransformer" to do so.

5. Return the final list of nodes.
This allows us to run nodes like the following:

@Transform('a=a+b+c')
def a():
    ...

And now a will be the result of adding a to b to c. Note
a, b, c, currently have to be a node or a config item.

Some caveats:
1. Its unsafe! We use eval now
2. Its just called transform.

Re (2) we should make it so the transform utilizes the same pattern as config:

1. The transform takes in a lambda that takes in a dict
2. transform.to(...) takes in an expression and creates (1) by passing to a superclass
For when the function signature doesn't accurately reflect
the dependency type -- E.G. in created nodes (such as a transform)
in which we want to use kwargs.
There's some subtlety with the name. Previously,
config.when would change the function name accordingly. This
was so that later on decorators could refer to the function by
its true name. However, this broke depending on decorator ordering.
This change adds a sanitize_name functionality to unblock the transform decorator,
but does not yet take away the name changing functionality. We'll need to think through
that, but this is a happy medium for now I think.
This solves #53

This allows us to decouple the notion of dynamic_transform from model.
Model was specific to Stitch Fix's use-case. With this change,
we can build more complex dynamic transforms that aren't limited to pd.Series
or simple combinations of nodes. This is also more readable/makes more inherent
sense. Note that full backwards compatibility has been maintained by subclassing
and keeping parameter names.
This is not going to make it into the final release. Instead see draft PR

#60


class NodeResolver(NodeTransformLifecycle):
"""Decorator to resolve a nodes function. Can modify anything about the function and is run before the node."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this more apt?

Suggested change
"""Decorator to resolve a nodes function. Can modify anything about the function and is run before the node."""
"""Decorator to resolve a nodes function. Can modify anything about the function and is run at DAG creation time."""

pass


class NodeExpander(SubDAGModifier):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc string

@@ -23,6 +24,25 @@ def is_submodule(child: ModuleType, parent: ModuleType):
return parent.__name__ in child.__name__


def custom_subclass_check(requested_type: Type[Type], param_type: Type[Type]):
"""This is a custom check as subclass does not work with python generic.
That said, its
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfinished comment.

- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
- @augment changes the function to make each of those two final nodes take in vars c and d
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

In all, this outputs two total nodes.
- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say what the names of the two nodes will be?

Suggested change
- @parametrized curries the function then turns it into two
- @parametrized curries the function then turns it into two; one named `e` and the other `f`.

- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
- @augment changes the function to make each of those two final nodes take in vars c and d
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

In all, this outputs two total nodes.
- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the names here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E and f I think? Added docs.

Comment on lines 6 to 31

@does(_sum)
@parametrized(parameter='a', assigned_output={('e', 'First value'): 10, ('f', 'First value'): 20})
@config.when(foo='bar')
def c__foobar(a: int, b: int) -> int:
"""Demonstrates utilizing a bunch of decorators.
In all, this outputs two total nodes.
- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
- @augment changes the function to make each of those two final nodes take in vars c and d
"""
pass


@does(_sum)
@parametrized(parameter='a', assigned_output={('e', 'First value'): 11, ('f', 'First value'): 22})
@config.when(foo='baz')
def c__foobaz(a: int, b: int) -> int:
"""Demonstrates utilizing a bunch of decorators.
In all, this outputs two total nodes.
- config.when makes it only apply when foo=bar
- @does makes it do the sum pattern
- @parametrized curries the function then turns it into two
- @augment changes the function to make each of those two final nodes take in vars c and d
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth commenting about the relationship between these two examples. They're both prefixed with c... I'm actually still confused what the output node names will be, given that e & f overlap...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are mutually disjoint -- e and f will have different values depending on the value of foo

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, duh. But wasn't obvious on first pass :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added docs to clarify :)

@elijahbenizzy elijahbenizzy merged commit 36a4a91 into main Feb 6, 2022
@elijahbenizzy elijahbenizzy deleted the decorator-refactor branch February 6, 2022 21:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants