-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop fitting pipeline after last fit block #132
Conversation
mlblocks/mlpipeline.py
Outdated
if block_name == self._last_fit_block: | ||
early_stop = True | ||
|
||
if (not early_stop) or (block_name in output_blocks): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (block_name in output_blocks)
should be changed to just output_blocks
.
The logic here is:
- If
output_blocks
is not empty, there is still at least 1 block from which we want the output - The block could not be "this one", but rather one that is later down the chain.
- If we check for name match, we may skip the current block, and then we will not be able to produce the next ones, which we want to capture.
So, the if
here would be:
if fit_pending or output_blocks:
self._produce_block...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to modify below, on line 798 (I cannot add a comment there because it has no changes):
The if output_variables is not None and not output_blocks
can be changed to:
if output_variables:
if not output_blocks:
# do the return logic here
elif not fit_pending:
return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @csala for the review. I went ahead with the changes, I'm wondering if the return logic is a bit difficult to trace now
for block_name, block in self.blocks.items():
# perform regular procedure
if output_variables:
if not output_blocks:
# find result
if debug:
return result, debug_info
return result
elif not fit_pending:
if debug:
return debug_info
return
# start_ check
if debug:
return debug_info
Ideally, it's a lot easier for me to see what is happening when there are less returns
to follow. I don't have a solution to this but I would prefer to break
instead and let the last return carry it out.
if output_variables:
if not output_blocks:
# return logic
elif not fit_pending:
break
mlblocks/mlpipeline.py
Outdated
@@ -767,6 +771,7 @@ def fit(self, X=None, y=None, output_=None, start_=None, debug=False, **kwargs): | |||
debug_info = defaultdict(dict) | |||
debug_info['debug'] = debug.lower() if isinstance(debug, str) else 'tmio' | |||
|
|||
early_stop = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
early_stop
sounds like an extraordinary thing but this functionality is just regular behavior.
I suggest inverting the meaning of the variable and calling it fit_pending
: "If True
, there are still blocks that are pending to be fitted"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
mlblocks/mlpipeline.py
Outdated
@@ -777,7 +782,10 @@ def fit(self, X=None, y=None, output_=None, start_=None, debug=False, **kwargs): | |||
|
|||
self._fit_block(block, block_name, context, debug_info) | |||
|
|||
if (block_name != self._last_block_name) or (block_name in output_blocks): | |||
if block_name == self._last_fit_block: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this check above line 776, so this is the first thing that we do on the loop.
The reason is that if we have a start_
that falls after the last_fit_block
, we will skip this and keep producing blocks until the end unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Resolve #131.