Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

explain load balancing for federation_sender_instances #17776

Merged
merged 3 commits into from
Oct 3, 2024

Conversation

dklimpel
Copy link
Contributor

@dklimpel dklimpel commented Oct 1, 2024

Adding information on how the load is distributed for federation_sender_instances.

Thx to @devonh for the information.

causal source:

class ShardedWorkerHandlingConfig:
"""Algorithm for choosing which instance is responsible for handling some
sharded work.
For example, the federation senders use this to determine which instances
handles sending stuff to a given destination (which is used as the `key`
below).
"""
instances: List[str]
def should_handle(self, instance_name: str, key: str) -> bool:
"""Whether this instance is responsible for handling the given key."""
# If no instances are defined we assume some other worker is handling
# this.
if not self.instances:
return False
return self._get_instance(key) == instance_name
def _get_instance(self, key: str) -> str:
"""Get the instance responsible for handling the given key.
Note: For federation sending and pushers the config for which instance
is sending is known only to the sender instance, so we don't expose this
method by default.
"""
if not self.instances:
raise Exception("Unknown worker")
if len(self.instances) == 1:
return self.instances[0]
# We shard by taking the hash, modulo it by the number of instances and
# then checking whether this instance matches the instance at that
# index.
#
# (Technically this introduces some bias and is not entirely uniform,
# but since the hash is so large the bias is ridiculously small).
dest_hash = sha256(key.encode("utf8")).digest()
dest_int = int.from_bytes(dest_hash, byteorder="little")
remainder = dest_int % (len(self.instances))
return self.instances[remainder]

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct
    (run the linters)

@github-actions github-actions bot deployed to PR Documentation Preview October 1, 2024 04:57 Active
@github-actions github-actions bot deployed to PR Documentation Preview October 1, 2024 04:59 Active
@dklimpel dklimpel marked this pull request as ready for review October 1, 2024 04:59
@dklimpel dklimpel requested a review from a team as a code owner October 1, 2024 04:59
Copy link
Member

@devonh devonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tweaked the wording a tiny bit.

Thank you very much for adding this clarification!

@github-actions github-actions bot deployed to PR Documentation Preview October 3, 2024 21:51 Active
@devonh devonh merged commit 8bbe66a into element-hq:develop Oct 3, 2024
33 checks passed
@dklimpel dklimpel deleted the lb_fed_sender branch October 4, 2024 05:23
yingziwu added a commit to yingziwu/synapse that referenced this pull request Oct 17, 2024
No significant changes since 1.117.0rc1.

- Add config option `redis.password_path`. ([\#17717](element-hq/synapse#17717))

- Fix a rare bug introduced in v1.29.0 where invalidating a user's access token from a worker could raise an error. ([\#17779](element-hq/synapse#17779))
- In the response to `GET /_matrix/client/versions`, set the `unstable_features` flag for [MSC4140](matrix-org/matrix-spec-proposals#4140) to `false` when server configuration disables support for delayed events. ([\#17780](element-hq/synapse#17780))
- Improve input validation and room membership checks in admin redaction API. ([\#17792](element-hq/synapse#17792))

- Clarify the docstring of `test_forget_when_not_left`. ([\#17628](element-hq/synapse#17628))
- Add documentation note about PYTHONMALLOC for accurate jemalloc memory tracking. Contributed by @hensg. ([\#17709](element-hq/synapse#17709))
- Remove spurious "TODO UPDATE ALL THIS" note in the Debian installation docs. ([\#17749](element-hq/synapse#17749))
- Explain how load balancing works for `federation_sender_instances`. ([\#17776](element-hq/synapse#17776))

- Minor performance increase for large accounts using sliding sync. ([\#17751](element-hq/synapse#17751))
- Increase performance of the notifier when there are many syncing users. ([\#17765](element-hq/synapse#17765), [\#17766](element-hq/synapse#17766))
- Fix performance of streams that don't change often. ([\#17767](element-hq/synapse#17767))
- Improve performance of sliding sync connections that do not ask for any rooms. ([\#17768](element-hq/synapse#17768))
- Reduce overhead of sliding sync E2EE loops. ([\#17771](element-hq/synapse#17771))
- Sliding sync minor performance speed up using new table. ([\#17787](element-hq/synapse#17787))
- Sliding sync minor performance improvement by omitting unchanged data from incremental responses. ([\#17788](element-hq/synapse#17788))
- Speed up sliding sync when there are many active subscriptions. ([\#17789](element-hq/synapse#17789))
- Add missing license headers on new source files. ([\#17799](element-hq/synapse#17799))

* Bump phonenumbers from 8.13.45 to 8.13.46. ([\#17773](element-hq/synapse#17773))
* Bump python-multipart from 0.0.10 to 0.0.12. ([\#17772](element-hq/synapse#17772))
* Bump regex from 1.10.6 to 1.11.0. ([\#17770](element-hq/synapse#17770))
* Bump ruff from 0.6.7 to 0.6.8. ([\#17774](element-hq/synapse#17774))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants