-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix perf when streams don't change often #17767
Conversation
There is a bug with the `StreamChangeCache` where it would incorrectly return that all entities had changed if asked for entities changed *since* the earliest stream position.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this looks correct, I think it is missing 1 tweak to the tests though.
# _cache is not valid before the earliest known stream position, so | ||
# return that the entity has changed. | ||
if stream_pos <= self._earliest_known_stream_pos: | ||
if stream_pos < self._earliest_known_stream_pos: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a long conversation about this @ matrix-org/synapse#14435 (comment), during which I said:
I'm a bit skeptical of the resulting consistency between strict and non-strict comparisons, but would like a double checking.
Your response was:
as if we call
entity_has_changed
for a stream pos thenhas_entity_changed
must return True for that stream pos. Either because we remember the entity has changed or because the stream pos is before the minimum threshold
It looks like I interpreted "before" incorrectly in that case, since at the stream position we do have the "proper" information needed to calculate whether there were changes or not.
The comment for _earliest_known_stream_pos
even makes this clear:
synapse/synapse/util/caches/stream_change_cache.py
Lines 91 to 93 in ef9ef99
# the earliest stream_pos for which we can reliably answer | |
# get_all_entities_changed. In other words, one less than the earliest | |
# stream_pos for which we know _cache is valid. |
tl;dr Before #14435 we were inconsistently checking if it was strictly before or before-or-equal, we made them consistent in that PR, but to the wrong one.
self.assertTrue(cache.has_entity_changed("[email protected]", 2)) | ||
self.assertTrue(cache.has_entity_changed("[email protected]", 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add checks for 3
(near line 35)?
Co-authored-by: Andrew Morgan <[email protected]>
No significant changes since 1.117.0rc1. - Add config option `redis.password_path`. ([\#17717](element-hq/synapse#17717)) - Fix a rare bug introduced in v1.29.0 where invalidating a user's access token from a worker could raise an error. ([\#17779](element-hq/synapse#17779)) - In the response to `GET /_matrix/client/versions`, set the `unstable_features` flag for [MSC4140](matrix-org/matrix-spec-proposals#4140) to `false` when server configuration disables support for delayed events. ([\#17780](element-hq/synapse#17780)) - Improve input validation and room membership checks in admin redaction API. ([\#17792](element-hq/synapse#17792)) - Clarify the docstring of `test_forget_when_not_left`. ([\#17628](element-hq/synapse#17628)) - Add documentation note about PYTHONMALLOC for accurate jemalloc memory tracking. Contributed by @hensg. ([\#17709](element-hq/synapse#17709)) - Remove spurious "TODO UPDATE ALL THIS" note in the Debian installation docs. ([\#17749](element-hq/synapse#17749)) - Explain how load balancing works for `federation_sender_instances`. ([\#17776](element-hq/synapse#17776)) - Minor performance increase for large accounts using sliding sync. ([\#17751](element-hq/synapse#17751)) - Increase performance of the notifier when there are many syncing users. ([\#17765](element-hq/synapse#17765), [\#17766](element-hq/synapse#17766)) - Fix performance of streams that don't change often. ([\#17767](element-hq/synapse#17767)) - Improve performance of sliding sync connections that do not ask for any rooms. ([\#17768](element-hq/synapse#17768)) - Reduce overhead of sliding sync E2EE loops. ([\#17771](element-hq/synapse#17771)) - Sliding sync minor performance speed up using new table. ([\#17787](element-hq/synapse#17787)) - Sliding sync minor performance improvement by omitting unchanged data from incremental responses. ([\#17788](element-hq/synapse#17788)) - Speed up sliding sync when there are many active subscriptions. ([\#17789](element-hq/synapse#17789)) - Add missing license headers on new source files. ([\#17799](element-hq/synapse#17799)) * Bump phonenumbers from 8.13.45 to 8.13.46. ([\#17773](element-hq/synapse#17773)) * Bump python-multipart from 0.0.10 to 0.0.12. ([\#17772](element-hq/synapse#17772)) * Bump regex from 1.10.6 to 1.11.0. ([\#17770](element-hq/synapse#17770)) * Bump ruff from 0.6.7 to 0.6.8. ([\#17774](element-hq/synapse#17774))
There is a bug with the
StreamChangeCache
where it would incorrectly return that all entities had changed if asked for entities changed since the earliest stream position.Note that for streams we use the inequalities:
$min_stream_id < stream_id <= $max_stream_id
, i.e. when we ask the stream change cache for all things that have changed since$stream_id
we don't care for events that happened at$stream_id
.Specifically:
_earliest_known_stream_pos
is the position at which we know that we'll have entries for all changes since that point, we can use the cache for any stream IDs that equal_earliest_known_stream_pos
._earliest_known_stream_pos
is set in three places:This was changed in matrix-org/synapse#14435, but I think we were overly conservative there.