Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order heroes by stream_ordering (as spec'ed) #17435

Merged
merged 7 commits into from
Jul 17, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/17435.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Order `heroes` by `stream_ordering` as the Matrix specification states (applies to `/sync`).
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
57 changes: 45 additions & 12 deletions synapse/storage/databases/main/roommember.py
Original file line number Diff line number Diff line change
Expand Up @@ -279,8 +279,19 @@ def _get_users_in_room_with_profiles(

@cached(max_entries=100000) # type: ignore[synapse-@cached-mutable]
async def get_room_summary(self, room_id: str) -> Mapping[str, MemberSummary]:
"""Get the details of a room roughly suitable for use by the room
"""
Get the details of a room roughly suitable for use by the room
summary extension to /sync. Useful when lazy loading room members.

Returns the total count of members in the room by membership type, and a
truncated list of members (the heroes). This will be the first 6 members of the
room:
- We want 5 heroes plus 1, in case one of them is the
calling user.
- They are ordered by `stream_ordering`, which are joined or
invited. When no joined or invited members are available, this also includes
banned and left users.

Args:
room_id: The room ID to query
Returns:
Expand Down Expand Up @@ -308,23 +319,36 @@ def _get_room_summary_txn(
for count, membership in txn:
res.setdefault(membership, MemberSummary([], count))

# we order by membership and then fairly arbitrarily by event_id so
# heroes are consistent
# Note, rejected events will have a null membership field, so
# we we manually filter them out.
# Order by membership (joins -> invites -> leave (former insiders) ->
# everything else (outsiders like bans/knocks), then by `stream_ordering` so
# the first members in the room show up first and to make the sort stable
# (consistent heroes).
#
# Note: rejected events will have a null membership field, so we we manually
# filter them out.
sql = """
SELECT state_key, membership, event_id
FROM current_state_events
WHERE type = 'm.room.member' AND room_id = ?
AND membership IS NOT NULL
ORDER BY
CASE membership WHEN ? THEN 1 WHEN ? THEN 2 ELSE 3 END ASC,
event_id ASC
CASE membership WHEN ? THEN 1 WHEN ? THEN 2 WHEN ? THEN 3 ELSE 4 END ASC,
event_stream_ordering ASC
Comment on lines +335 to +336
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By changing this now, it will probably cause peoples rooms names to calculate to something different across the board (this function is also used by Sync v2). Are we ok with that? Potentially disruptive if people are really used to their old "flawed" name.

It's already a FIXME comment in extract_heroes_from_room_summary(...) but we still need to pull in the right order from the database.

Related to matrix-org/matrix-spec#1334

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To note the "flaws" with the current implementation sorting by event_id, the heroes will shift around as people join because their join event_id might sort lexicographically before others. The heroes will also shift as people change their display name and avatar because their join event_id will change. It also simply doesn't grab the first 5 members of the room as expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you say, it's already unstable. I don't see any harm in changing this.

MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
LIMIT ?
"""

# 6 is 5 (number of heroes) plus 1, in case one of them is the calling user.
txn.execute(sql, (room_id, Membership.JOIN, Membership.INVITE, 6))
txn.execute(
sql,
(
room_id,
# Sort order
Membership.JOIN,
Membership.INVITE,
Membership.LEAVE,
# 6 is 5 (number of heroes) plus 1, in case one of them is the calling user.
6,
),
)
for user_id, membership, event_id in txn:
summary = res[membership]
# we will always have a summary for this membership type at this
Expand Down Expand Up @@ -1509,10 +1533,19 @@ def extract_heroes_from_room_summary(
) -> List[str]:
"""Determine the users that represent a room, from the perspective of the `me` user.

This function expects `MemberSummary.members` to already be sorted by
`stream_ordering` like the results from `get_room_summary(...)`.

The rules which say which users we select are specified in the "Room Summary"
section of
https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3sync


Args:
details: Mapping from membership type to member summary. We expect
`MemberSummary.members` to already be sorted by `stream_ordering`.
me: The user for whom we are determining the heroes for.

Returns a list (possibly empty) of heroes' mxids.
"""
empty_ms = MemberSummary([], 0)
Expand All @@ -1527,11 +1560,11 @@ def extract_heroes_from_room_summary(
r[0] for r in details.get(Membership.LEAVE, empty_ms).members if r[0] != me
] + [r[0] for r in details.get(Membership.BAN, empty_ms).members if r[0] != me]

# FIXME: order by stream ordering rather than as returned by SQL
# We expect `MemberSummary.members` to already be sorted by `stream_ordering`
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
if joined_user_ids or invited_user_ids:
return sorted(joined_user_ids + invited_user_ids)[0:5]
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
return (joined_user_ids + invited_user_ids)[0:5]
else:
return sorted(gone_user_ids)[0:5]
return gone_user_ids[0:5]


@attr.s(slots=True, auto_attribs=True)
Expand Down
21 changes: 9 additions & 12 deletions tests/rest/client/test_sync.py
Original file line number Diff line number Diff line change
Expand Up @@ -2160,18 +2160,15 @@ def test_rooms_meta_heroes_max(self) -> None:

# Room2 doesn't have a name so we should see `heroes` populated
self.assertIsNone(channel.json_body["rooms"][room_id1].get("name"))
# FIXME: Remove this basic assertion and uncomment the better assertion below
# after https://github.com/element-hq/synapse/pull/17435 merges
self.assertEqual(len(channel.json_body["rooms"][room_id1].get("heroes", [])), 5)
# self.assertCountEqual(
# [
# hero["user_id"]
# for hero in channel.json_body["rooms"][room_id1].get("heroes", [])
# ],
# # Heroes should be the first 5 users in the room (excluding the user
# # themselves, we shouldn't see `user1`)
# [user2_id, user3_id, user4_id, user5_id, user6_id],
# )
self.assertCountEqual(
[
hero["user_id"]
for hero in channel.json_body["rooms"][room_id1].get("heroes", [])
],
# Heroes should be the first 5 users in the room (excluding the user
# themselves, we shouldn't see `user1`)
[user2_id, user3_id, user4_id, user5_id, user6_id],
)
self.assertEqual(
channel.json_body["rooms"][room_id1]["joined_count"],
7,
Expand Down
Loading
Loading