-
Notifications
You must be signed in to change notification settings - Fork 402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(event_sources): implement Mapping protocol on DictWrapper for better interop with existing middlewares #1516
Conversation
9633617
to
8e5efa9
Compare
8e5efa9
to
f8d5692
Compare
Has something gone wild with the git history? I definitely did not change 52 files. =) Gimme a bit to fix. |
@Tankanow yes we had a small problem with a pipeline process. I think the best way is to reset your local repo from |
f8d5692
to
26c2135
Compare
…methods to DictWrapper ISSUE-1503: Add DictWrapper Mapping abc tests ISSUE-1503: add StreamRecord tests
26c2135
to
c5394f5
Compare
|
||
|
||
class DictWrapper: | ||
class DictWrapper(Mapping): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: I thought for a long time whether or not to explicitly add the Mapping
protocol here or to simply implement the missing dunder methods and allow duck-typing to carry the day.
In the end, I decided it was better to explicitly add the Mapping
protocol because, all things being equal, I think it is bad for subclasses of this class to break that protocol. It is good for mypy
to warn when an implementer chooses to break the protocol. It should require an explicit type: ignore
comment when choosing to break the protocol.
There is one existing class, StreamRecord
, that breaks the protocol. There are no subclasses of StreamRecord
and it is used only within the context of a DynamoDBRecord
object. Because there is only one instance of this type breakage, and it is so localized, I marked StreamRecord.keys
as type: ignore
and added a comment and some tests to indicate that this class explicitly breaks the Mapping
protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In older days I'd ask to remove it as we're breaking it in StreamRecord
- I'd take this as an instance of Practicality Beats Purity
.
It's highly unlikely a customer will use .keys
to get the keys of a StreamRecord
dict given the intent. However, if we do receive any customer complaint we'll consider removing the subclassing.
@rubenfonseca, this is ready for review. I added a review comment to describe one design choice. I'm excited for your feedback. |
Codecov ReportBase: 99.73% // Head: 99.73% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## develop #1516 +/- ##
========================================
Coverage 99.73% 99.73%
========================================
Files 127 127
Lines 5729 5733 +4
Branches 651 651
========================================
+ Hits 5714 5718 +4
Misses 8 8
Partials 7 7
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Thank you so much for following up! After discussing this internally, we would like to move the discussion to our v2 release. We are concerned with the changes that this PR might introduce to consumers of Action: move PR base to v2 |
Adding to @rubenfonseca comments on our concern about the MappingInterface. The way Event Source Data Classes was done it abuses the word "Dict", when in fact is a class. If we implement an iter for instance, it wouldn't be true for most Event Source Data Classes as the data varies in format - e.g., DynamoDB keeps a list of records in data.records, while API GW keeps in data.body. A long-term strategy is to get rid of the duplication of Event Source Data Classes and Parser (Pydantic). We're waiting for Pydantic v2 to be launched to analyse its package size -- doing so would improve speed (~17x faster), validation, and open the door to more easily implement strict typing for all let alone a single source of truth for Event Source Models. We'd still love to hear from you if you feel strongly inclined in having the protocol implemented either way in v2 |
Thanks for the response @heitorlessa. I respectfully disagree on this point:
The To this point, in practice - that is in the entire code base - I could only find 1 subclass that breaks the interface. It seems that there are no consumers that expect the behavior you are worried about, i.e. that iter should behave differently for different event subclasses. Lastly, it is true that this is a breaking change for anyone using Powertools who has currently implemented a negative check on the from aws_lambda_powertools.utilities.data_classes import APIGatewayProxyEvent
from aws_lambda_powertools.utilities.data_classes.common import DictWrapper
from typing import Mapping, Union
def handler(event: Union[Mapping, DictWrapper], context):
event_dict = event.raw_event if not isinstance(event, Mapping) else event
return event_dict
event_dict = {'foo': 'bar'}
event_obj = APIGatewayProxyEvent(event_dict)
assert handler(event_dict, None) == handler(event_obj, None) I feel like it is a corner case. As I feel like most consumers would use the positive check |
Ah, I missed an important detail on iter - you want to access the raw_event property (actual dict), and not the data property where a payload might be (records, body etc). If I got that right I take it back what I alluded earlier with different subclasses, and I fully agree we should do that. I don't have data points on whether people are checking the superclass instance (DictWrapper) as opposed to the Event Source Class itself like we do even internally -- for the sake of Hyrum's law as customers might have taken a dependency on this, we should do it safely for V2 and document in the upgrade guide. Let me know if I still haven't processed this correctly |
Yes!
I totally agree. I regret missing this when I opened this PR against V1. I'm glad you and @rubenfonseca caught it. I love your commitment to non-breaking changes as even breaking changes in a major release can be really frustrating to users. I would be happy to help V2 by testing against our code bases. Please let me know how I can help. |
Perfect, we have a plan then. Could you create a PR implementing the mapping interface against the v2 branch? We can take specific questions, if any, about implementation and the messaging on upgrade guide. @rubenfonseca @leandrodamascena and I can support you Thank you!!! |
@heitorlessa and @rubenfonseca, Regarding this request:
It looks like someone moved this PR destination to |
@Tankanow we'll be looking into the PR this week ;) I mistakenly added a comment in the PR asking to create a PR :D, I'll be flying back home tomorrow, so EOW in the worst case. |
Looking into this now... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @Tankanow - a minor note on subclassing (practicality vs purity) with our intent to keep but will revert if it affects any use case we couldn't account for, and code suggestions to improve maintenance.
We can merge once these are in ;)
Thank you!!!!
|
||
|
||
class DictWrapper: | ||
class DictWrapper(Mapping): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In older days I'd ask to remove it as we're breaking it in StreamRecord
- I'd take this as an instance of Practicality Beats Purity
.
It's highly unlikely a customer will use .keys
to get the keys of a StreamRecord
dict given the intent. However, if we do receive any customer complaint we'll consider removing the subclassing.
aws_lambda_powertools/utilities/data_classes/dynamo_db_stream_event.py
Outdated
Show resolved
Hide resolved
aws_lambda_powertools/utilities/data_classes/dynamo_db_stream_event.py
Outdated
Show resolved
Hide resolved
aws_lambda_powertools/utilities/data_classes/dynamo_db_stream_event.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Heitor Lessa <[email protected]>
Just re-checking this PR, I agree that we should merge this into |
Thanks for addressing the feedback @Tankanow - merging now. THANK YOU <3 |
Issue #1503
Summary
DictWrapper and all of its subclasses should fully implement the Mapping interface so that it can be used interchangeably with code that expects the event to be a dictionary. This helps Powertools Middleware play nicely with existing middleware.
Changes
Add
Mapping
as parent class ofDictWrapper
, implement missing__len__
and__iter__
abstract methods.User experience
Before this change, a user would encounter exceptions when trying to use many standard functions on dictionaries coerced into
DictWrapper
. Though a user could access the raw dictionary with theraw_event
property, this was often clumsy, causing existing middleware to break (or to be order dependent) or causing the user to write lots of type-checking code.Checklist
If your change doesn't seem to apply, please leave them unchecked.
Is this a breaking change? No
RFC issue number:
Checklist:
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.
View rendered CHANGELOG.md