-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(eventuser): Migrate IssuesByTagProcessor away from EventUser #59672
feat(eventuser): Migrate IssuesByTagProcessor away from EventUser #59672
Conversation
08ca9e4
to
eba2bdb
Compare
eba2bdb
to
9eca754
Compare
@@ -97,7 +97,7 @@ def serialize_row(item, key): | |||
} | |||
if key == "user": | |||
euser = item._eventuser | |||
result["id"] = euser.ident if euser else "" | |||
result["id"] = euser.user_id if euser else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we stop using ident here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm nm, I remember your comment here:
@wedamija the dataclass's id is to mirror the EventUser model id column. user_id in the dataclass will replace the ident column in EventUser
Although now that I'm seeing it it is a little confusing to call it user_id
since it implies an id from the User
model. Should this at least be user_ident
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EventUser dataclass does not have the property ident
. ident
in the EventUser model is equivalent to the user_id
. This is where we create the EventUser row and you can see that ident
is from user.id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep! I am ok to rename to user_ident
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And updated this PR
9eca754
to
6bba8b7
Compare
src/sentry/utils/eventuser.py
Outdated
result = {} | ||
for value in values: | ||
keyword_filters = {value.split(":", 1)[0]: value.split(":", 1)[-1]} | ||
eventusers = EventUser.for_projects(projects, keyword_filters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Tbh we should probably change for_projects
to just return an EventUser
dataclass or None, rather than having to unwrap it like this.
18c8ee0
to
c2a8110
Compare
19fdff2
to
e4a6c1b
Compare
e4a6c1b
to
2179614
Compare
48747c3
to
02571b5
Compare
f37acc4
to
e59edf3
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #59672 +/- ##
==========================================
+ Coverage 80.61% 80.76% +0.14%
==========================================
Files 5175 5180 +5
Lines 227259 227578 +319
Branches 38243 38289 +46
==========================================
+ Hits 183214 183795 +581
+ Misses 38464 38187 -277
- Partials 5581 5596 +15
|
src/sentry/utils/eventuser.py
Outdated
) | ||
|
||
if return_all: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not return_all
?
src/sentry/utils/eventuser.py
Outdated
def _find_matching_items( | ||
snuba_results: List[dict[str, Any]], filters: Mapping[str, Any], filter_boolean: str | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to match on the keyword filters. The way to find a unique EventUser
is to look at the sentry:user
tag, which maps to the user
column in Snuba. The value is assigned here:
sentry/src/sentry/event_manager.py
Line 1064 in 27e7250
set_tag(data, "sentry:user", user.tag_value) |
So the user
column will just be the result of
sentry/src/sentry/models/eventuser.py
Lines 85 to 91 in fda93e5
def tag_value(self): | |
""" | |
Return the identifier used with tags to link this user. | |
""" | |
for key, value in self.iter_attributes(): | |
if value: | |
return f"{KEYWORD_MAP[key]}:{value}" |
That tag is mapped to a column here: https://github.com/getsentry/snuba/blob/94c47010b1c6425942684b54d3ead222c0ad181a/snuba/datasets/configuration/events/storages/errors.yaml#L350
So if we had a window function, we'd partition on user
and fetch the first row for each user. In this case, we have to group on all the cols and just select the most recent user in code instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh lol! that would have saved a lot of time yesterday 😭. This simplifies the logic alot.
src/sentry/utils/eventuser.py
Outdated
snuba_keyword_filters[snuba_column] = value | ||
|
||
matches = self._find_matching_items(data_results, snuba_keyword_filters, filter_boolean) | ||
first_matching_items.extend(matches.values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just assign directly here? Or just use matches.values()
directly to get results
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic has been removed with the simplification of just using tag_value
to get the most recent unique EventUsers
src/sentry/utils/eventuser.py
Outdated
|
||
columns = [ | ||
Column("project_id"), | ||
Column("group_id"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is group_id
for here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I removed it. Its a legacy column from the original snuba query for analytics.
I also removed the user
column from the Snuba query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! Tests are pretty thorough, thanks.
Make sure you monitor this deploy and be ready for a rollback in case we see anything going wrong. Ideally just at a time when we're both not in meetings.
src/sentry/utils/eventuser.py
Outdated
self, | ||
projects: List[Project], | ||
keyword_filters: Mapping[str, List[Any]], | ||
filter_boolean="AND", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Wondering if this should just be the BooleanOp
project_ids=[p.id for p in projects], | ||
query=query.print(), | ||
count_rows_returned=len(data_results), | ||
count_rows_filtered=len(data_results) - len(results), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also be worth including query time here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does raw_snql_query
func return this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, might be in the response, I don't remember. Was thinking you could just keep your own timer here
src/sentry/utils/eventuser.py
Outdated
@@ -72,7 +73,10 @@ def from_event(event: Event) -> EventUser: | |||
project_id=event.project_id if event else None, | |||
email=event.data.get("user", {}).get("email") if event else None, | |||
username=event.data.get("user", {}).get("username") if event else None, | |||
name=event.data.get("user", {}).get("name") if event else None, | |||
name=event.data.get("user", {}).get("name") | |||
or event.data.get("user", {}).get("username") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not totally sure we should put username here, should be ok to leave it blank if not provided (unless there's an example elsewhere where we substitute in username)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good!
src/sentry/utils/eventuser.py
Outdated
for keyword, value in keyword_filters.items(): | ||
if not isinstance(value, list): | ||
raise Exception(f"{keyword} filter must be a list of values") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ValueError
?
PR reverted: a84ed80 |
…User (#59672)" This reverts commit e3f4723. Co-authored-by: NisanthanNanthakumar <[email protected]>
Objective:
Create the
for_tags
method in the EventUser dataclass. Migratesget_eventuser_callback
to use the dataclass.