-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make keying of examples explicit. #21777
Conversation
This decouples the keying logic from the DoFn and helps with type inference. A MaybeKeyedModelLoader could be added to make this decision dynamically if desired.
R: @yeandy |
Codecov Report
@@ Coverage Diff @@
## master #21777 +/- ##
=======================================
Coverage 74.01% 74.02%
=======================================
Files 698 698
Lines 92224 92229 +5
=======================================
+ Hits 68263 68270 +7
+ Misses 22710 22708 -2
Partials 1251 1251
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Added maybe keyed class and rebased atop ModelLoader merge. PTAL. |
@@ -119,7 +119,7 @@ def run_inference( | |||
predictions = model(batched_tensors, **prediction_params) | |||
return [PredictionResult(x, y) for x, y in zip(batch, predictions)] | |||
|
|||
def get_num_bytes(self, batch: List[torch.Tensor]) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the same Sequence
change for sklearn_inference.py
lines 78, 89, 97, 116, 121
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -93,6 +95,100 @@ def batch_elements_kwargs(self) -> Mapping[str, Any]: | |||
return {} | |||
|
|||
|
|||
class KeyedModelHandler(Generic[KeyT, ExampleT, PredictionT, ModelT], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there's any way to make this more readable or simple. All these nested lists are making my eyes a little buggy.
Can we perhaps use constants here?
BASIC_MODEL_HANDLER = ModelHandler[Tuple[KeyT, ExampleT]
KEYED_PREDICTION = Tuple[KeyT, PredictionT]
Or are there any other ways to make some of this templating go away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with this a bit, don't see a way to really make things much simpler. It is possible delete the generic, but then it becomes harder to reason about the order of the nested arguments.
* refactor code from api to base * delete api.py * modify imports * Add todo to mypy github issue * Refactor code to reflect changes of #21777 * Refactor example with KeyedModelHandler * remove explicit type hints from RunInference class * Fixup : Lint * remove TODO to github issue for mypy error * Add mypy github issue as TODO
This decouples the keying logic from the DoFn and helps with type inference. There is both a KeyedModelHandler that expects keys and a MaybeKeyedModelHandler that preserves the old behavior.
* refactor code from api to base * delete api.py * modify imports * Add todo to mypy github issue * Refactor code to reflect changes of apache#21777 * Refactor example with KeyedModelHandler * remove explicit type hints from RunInference class * Fixup : Lint * remove TODO to github issue for mypy error * Add mypy github issue as TODO
This decouples the keying logic from the DoFn and helps with type inference.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.