[Feature Request] MoE enhancements #478

Open
3 of 9 tasks
GreenFatGuy opened this issue Jun 4, 2022 · 1 comment
Labels
enhancement New feature or request help wanted Extra attention is needed p2p Everything related to the libp2p-daemon. server

Comments

@GreenFatGuy
Collaborator

GreenFatGuy commented Jun 4, 2022

The nature of this issue

During the review of #470, we collected a list of things that were not crucial for that PR but should ideally be done. The problems are described below.
This issue is deliberately general and gathers all the problems found during that PR.

Use only PeerID

The idea of p2p is that a PeerID is enough to communicate with another daemon, and a Multiaddr is needed only to start a new node. Thus we should reduce the usage of Multiaddr wherever possible.

Move cpu-bound things inside separate executor

There are some places in the code (for example, forward/backward in moe.client.expert) where CPU-bound work, such as serialization/deserialization, takes place inside an async task and blocks the event loop. To increase efficiency, this work should be moved into a thread executor.
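A minimal sketch of the idea, using only the standard library. The names `_serialization_pool`, `serialize`, `deserialize` and `forward` are hypothetical, and `pickle` stands in for the real tensor serialization; this is not the actual moe.client.expert code.

```python
import asyncio
import pickle
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool for CPU-bound (de)serialization work.
_serialization_pool = ThreadPoolExecutor(max_workers=4)


def serialize(payload):
    """Stand-in for tensor serialization (assumption: pickle replaces the real codec)."""
    return pickle.dumps(payload)


def deserialize(blob):
    """Stand-in for tensor deserialization."""
    return pickle.loads(blob)


async def forward(payload):
    loop = asyncio.get_running_loop()
    # Offload serialization so the event loop can keep serving other tasks.
    blob = await loop.run_in_executor(_serialization_pool, serialize, payload)
    # The remote call to the expert would go here; this sketch just echoes the bytes.
    return await loop.run_in_executor(_serialization_pool, deserialize, blob)


result = asyncio.run(forward([1.0, 2.0, 3.0]))
```

Note that a thread pool only helps to the extent that the offloaded work releases the GIL; otherwise a process pool may be the better fit.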

Check inputs on server side

Currently hivemind.Server does not check that inputs are correct. If a user sends malformed inputs (for example, tensors that are far larger than expected), the server may run out of memory. We should add such checks in a future PR. See #3
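One possible shape for such a check, as a sketch: compare the declared dtype and shape of each incoming tensor against an expected schema before allocating anything. `EXPECTED_INPUTS`, `MAX_ELEMENTS` and `validate_inputs` are hypothetical names, and tensors are represented here as plain `(dtype, shape)` pairs rather than real torch objects.

```python
from math import prod

# Hypothetical limits and schema; real values would come from the expert's declared inputs.
MAX_ELEMENTS = 2 ** 24
EXPECTED_INPUTS = [("float32", (None, 1024))]  # None marks the variable batch dimension


def validate_inputs(inputs):
    """Check a list of (dtype_name, shape) pairs; raise ValueError on any mismatch."""
    if len(inputs) != len(EXPECTED_INPUTS):
        raise ValueError(f"expected {len(EXPECTED_INPUTS)} tensors, got {len(inputs)}")
    for (dtype, shape), (want_dtype, want_shape) in zip(inputs, EXPECTED_INPUTS):
        if dtype != want_dtype:
            raise ValueError(f"expected dtype {want_dtype}, got {dtype}")
        if len(shape) != len(want_shape):
            raise ValueError(f"expected {len(want_shape)} dimensions, got {len(shape)}")
        for got, want in zip(shape, want_shape):
            if got < 0 or (want is not None and got != want):
                raise ValueError(f"shape {shape} does not match {want_shape}")
        # Reject oversized requests before any memory is allocated for them.
        if prod(shape) > MAX_ELEMENTS:
            raise ValueError(f"tensor with shape {shape} exceeds {MAX_ELEMENTS} elements")


validate_inputs([("float32", (8, 1024))])  # passes silently
```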

Sending empty input causes exception

If a client sends a tensor of shape [0, ...] (an empty tensor), it will be split into zero messages and the uid will not be passed. The server will receive uid=None and fail with a cryptic KeyError(None). We should either forbid this on the client side or ensure that zero-element tensors are serialized into a stream whose first message is empty.
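The second option could look roughly like this. It is a sketch over raw bytes, assuming the uid travels with the first message of the stream; `split_into_messages` and `chunk_size` are hypothetical names, not the actual hivemind splitting code.

```python
def split_into_messages(uid, payload, chunk_size=4):
    """Yield (uid, chunk) pairs; the uid is attached to the first message only."""
    if not payload:
        # Without this branch an empty payload yields nothing and the uid is lost.
        yield uid, b""
        return
    for start in range(0, len(payload), chunk_size):
        yield (uid if start == 0 else None), payload[start:start + chunk_size]


messages = list(split_into_messages("expert.0", b""))
```

With this guard, an empty payload still produces exactly one message, so the server always sees a uid.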

MoE operates only with lists of tensors

The code expects inputs/outputs to be Iterable[torch.Tensor]; however, they can have a more complex structure, such as a dict with meta information.
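One way to support nested structures without changing the wire format is to flatten them into a list of leaves plus a structure description, and rebuild them on the other side. The sketch below uses hypothetical helpers `flatten` and `unflatten` written from scratch, with plain numbers standing in for tensors.

```python
def flatten(obj):
    """Return (leaves, spec), where spec records the container layout."""
    if isinstance(obj, dict):
        keys = list(obj)
        pairs = [flatten(obj[key]) for key in keys]
        kind = "dict"
    elif isinstance(obj, (list, tuple)):
        keys = None
        pairs = [flatten(item) for item in obj]
        kind = "tuple" if isinstance(obj, tuple) else "list"
    else:
        return [obj], ("leaf", None, None)
    leaves = [leaf for sub_leaves, _ in pairs for leaf in sub_leaves]
    return leaves, (kind, keys, [sub_spec for _, sub_spec in pairs])


def unflatten(leaves, spec):
    """Rebuild the original structure from a flat list of leaves and its spec."""
    leaf_iter = iter(leaves)

    def build(node):
        kind, keys, children = node
        if kind == "leaf":
            return next(leaf_iter)
        if kind == "dict":
            return {key: build(child) for key, child in zip(keys, children)}
        items = [build(child) for child in children]
        return tuple(items) if kind == "tuple" else items

    return build(spec)


data = {"x": [1, 2], "meta": {"id": 7}, "pair": (3, 4)}
leaves, spec = flatten(data)
restored = unflatten(leaves, spec)
```

Only the flat list of leaves would need to go through the existing tensor path; the spec is small and could be sent alongside as metadata.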

Test load balancing for unary handlers on python side

Load balancing is tested inside libp2p-daemon itself, and we also have some tests covering stream handlers. However, there are no tests for load balancing of unary handlers on the hivemind side.

Remove gRPC-specific Python file compilation

Since gRPC-based communication is no longer present in hivemind, we can remove the corresponding compilation commands from setup.py.

Add --identity_path to run_server.py

Similarly to examples/albert, it would be great to have an option that fixes the libp2p address of the server, so that it stays the same across restarts.
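On the command-line side this could be as small as one extra argument. The sketch below only shows the parsing; `build_parser` is a hypothetical helper, and how the value is forwarded to the component that starts the libp2p daemon is left open.

```python
import argparse


def build_parser():
    """Hypothetical parser fragment; the real run_server.py has many more options."""
    parser = argparse.ArgumentParser(description="Sketch of a server launcher")
    parser.add_argument(
        "--identity_path",
        type=str,
        default=None,
        help="Path to a file with the private key that determines this server's PeerID",
    )
    return parser


args = build_parser().parse_args(["--identity_path", "server.id"])
```

When the flag is omitted, `args.identity_path` is `None`, which would correspond to the current behaviour of generating a fresh identity on every start.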

TODO List:

  • Use only PeerID where it is possible
  • Move cpu-bound things inside separate executor
  • Check inputs on server side
  • Sending empty input causes exception
  • MoE operates only with lists of tensors
  • Test load balancing for unary handlers on python side
  • Remove gRPC-specific Python file compilation
  • Add --identity_path to run_server.py
  • Make PeerID and ExpertData msgpack-serializable
@GreenFatGuy GreenFatGuy added enhancement New feature or request help wanted Extra attention is needed server p2p Everything related to the libp2p-daemon. labels Jun 4, 2022
@mryab mryab changed the title [Feature Request] MoE enhancments [Feature Request] MoE enhancements Jun 4, 2022
@justheuristic
Member

justheuristic commented Jun 23, 2022

Remove gRPC-specific Python file compilation

fixed in #485
