Use PeerID exclusively to address MoE experts #479

Merged
merged 15 commits into from
Jun 7, 2022

justheuristic
Member

@justheuristic justheuristic commented Jun 6, 2022

This PR changes declare_experts / RemoteExpert to address experts by their p2p peer ID only, rather than by the full multiaddress.
This slightly reduces code complexity and makes it easier to share experts hosted behind a dynamic IP.

It also fixes one DHT edge case I discovered while working on it.

Minor changes:

  • fixed an edge case: previously, DHT would freeze when accessing DHT.peer_id or otherwise calling .run_coroutine from inside another run_coroutine
  • merged RemoteExpertInfo and UidEndpoint into one structure (ExpertInfo), now in expert_uid.py
  • moved expert_uid.py from hivemind.moe.server to hivemind.moe in order to avoid circular imports
  • renamed get_expert_stub to get_server_stub since it is not expert-specific
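
The freeze described in the first bullet is a general hazard of blocking on an event loop from code that is already running on that loop. The sketch below is a generic asyncio illustration of that hazard and of one possible guard, not hivemind's DHT code: the BackgroundLoop class and its methods are hypothetical names, and the actual fix in this PR may work differently.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical stand-in for an object that runs an event loop in a background thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def run_coroutine(self, coro_fn, *args):
        """Run coro_fn(*args) on the background loop and block until it finishes."""
        if threading.current_thread() is self._thread:
            # Blocking here would stall the loop that must execute the coroutine,
            # so the call would never return. Fail loudly instead of freezing.
            raise RuntimeError("run_coroutine called from inside the loop thread; await instead")
        future = asyncio.run_coroutine_threadsafe(coro_fn(*args), self.loop)
        return future.result()

    def shutdown(self):
        """Stop the loop and wait for the background thread to exit."""
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()
```

Without the thread check, a coroutine that calls run_coroutine on its own loop would wait on a result that the blocked loop can never produce.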

@codecov

codecov bot commented Jun 6, 2022

Codecov Report

Merging #479 (1bcc4a3) into master (c49802a) will decrease coverage by 0.07%.
The diff coverage is 72.72%.

@@            Coverage Diff             @@
##           master     #479      +/-   ##
==========================================
- Coverage   83.02%   82.95%   -0.08%     
==========================================
  Files          83       83              
  Lines        8177     8177              
==========================================
- Hits         6789     6783       -6     
- Misses       1388     1394       +6     
Impacted Files Coverage Δ
hivemind/p2p/p2p_daemon_bindings/datastructures.py 73.58% <ø> (-2.10%) ⬇️
hivemind/moe/client/beam_search.py 56.91% <37.03%> (+2.36%) ⬆️
hivemind/moe/client/expert.py 83.60% <90.00%> (-0.53%) ⬇️
hivemind/dht/dht.py 91.51% <100.00%> (+0.26%) ⬆️
hivemind/moe/client/moe.py 92.78% <100.00%> (ø)
hivemind/moe/client/switch_moe.py 92.50% <100.00%> (ø)
hivemind/moe/expert_uid.py 100.00% <100.00%> (ø)
hivemind/moe/server/dht_handler.py 98.18% <100.00%> (+0.06%) ⬆️
hivemind/moe/server/server.py 80.11% <100.00%> (-0.11%) ⬇️
hivemind/averaging/matchmaking.py 83.92% <0.00%> (-1.49%) ⬇️
... and 2 more

hivemind/moe/client/beam_search.py, comment on lines 316 to 317
    if not best_active_pairs:
        break
Collaborator

If I understand everything correctly, without this if statement we would still break out of the loop, because we would eventually get an empty beam on L330 and reach the break at L333. However, L332 has a log.warning, which tells us that this situation deserves some attention.
TL;DR: shouldn't we do log.warning here?

Member

Yes, I think that it should at least require an explanation via a comment/log entry

Member Author

@justheuristic justheuristic Jun 7, 2022

good catch, this if is not necessary
gonna remove it

Collaborator

Well, in fact this if statement does bring some benefit: we won't make a lot of calls just to find out that there is nothing to search for. As I understand it, there will be no network communication either way, but there will be a small time saving.

Anyway, either change is acceptable: adding logging or removing this if statement altogether.
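
The trade-off discussed in this thread (exit early when nothing is left to expand, but log it so the situation gets attention) can be sketched generically. This is not the code from beam_search.py: expand_levels, get_children, and the prefix strings are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def expand_levels(initial_prefixes, get_children, max_depth):
    """Expand prefixes level by level, stopping early once no active prefixes remain."""
    levels = []
    active = list(initial_prefixes)
    for depth in range(max_depth):
        if not active:
            # Early exit: skip further lookups, but leave a trace in the logs.
            logger.warning("No active prefixes left at depth %d; stopping early", depth)
            break
        levels.append(active)
        # One lookup per active prefix; with an empty list no lookups happen at all.
        active = [child for prefix in active for child in get_children(prefix)]
    return levels
```

The early check costs one comparison per level and avoids entering the per-prefix lookup code when there is nothing to search for.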

- experts[i] = RemoteExpertInfo(uid, PeerInfo.from_tuple(expert_info_for_uid.value))
+ server_peer_id = found[uid]
+ if server_peer_id is not None and isinstance(server_peer_id.value, str):
+     experts[i] = ExpertInfo(uid, PeerID.from_base58(server_peer_id.value))
Member

This construction (involving PeerID.from_base58) occurs at least 3 times in the codebase; perhaps it's worth changing ExpertInfo into a dataclass with an additional classmethod from_binary for simplicity?
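
A minimal sketch of what this suggestion could look like, under stated assumptions: the PeerID class below is a simplified stand-in that only stores the string (the real hivemind PeerID decodes base58), the field names uid and peer_id are assumed, and the classmethod name from_dht_value is hypothetical rather than taken from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PeerID:
    """Simplified stand-in for hivemind's PeerID; it only wraps the base58 string."""

    base58: str

    @classmethod
    def from_base58(cls, value: str) -> "PeerID":
        return cls(value)


@dataclass(frozen=True)
class ExpertInfo:
    uid: str
    peer_id: PeerID

    @classmethod
    def from_dht_value(cls, uid: str, value: object) -> Optional["ExpertInfo"]:
        """Build ExpertInfo from a raw DHT value, or return None if it is not a string."""
        if isinstance(value, str):
            return cls(uid, PeerID.from_base58(value))
        return None
```

Call sites would then replace the repeated isinstance check plus PeerID.from_base58 call with a single constructor call and a None check.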

Comment on lines 153 to 157
and isinstance(match, ValueWithExpiration)
and isinstance(match.value, tuple)
and len(match.value) == 2
and is_valid_uid(match.value[0])
and isinstance(match.value[1], str)
Member

Maybe extract this 5-line validation into a function?

Member Author

done
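
For reference, extracting the five-line check quoted above into a helper might look roughly like the sketch below. The helper name is_valid_match is hypothetical, and ValueWithExpiration and is_valid_uid are simplified stand-ins for the hivemind originals (in particular, the UID pattern here is an assumption, not the library's regex).

```python
import re
from typing import Any, NamedTuple


class ValueWithExpiration(NamedTuple):
    """Simplified stand-in for hivemind's ValueWithExpiration."""

    value: Any
    expiration_time: float


# Assumed UID shape: a dot-free prefix followed by one or more ".<index>" parts.
_UID_PATTERN = re.compile(r"[^.]+(\.(0|[1-9][0-9]*))+")


def is_valid_uid(uid: Any) -> bool:
    """Stand-in validator; the real is_valid_uid may accept a different pattern."""
    return isinstance(uid, str) and _UID_PATTERN.fullmatch(uid) is not None


def is_valid_match(match: Any) -> bool:
    """Check that a DHT entry holds a (uid, peer_id_string) pair with an expiration."""
    return (
        isinstance(match, ValueWithExpiration)
        and isinstance(match.value, tuple)
        and len(match.value) == 2
        and is_valid_uid(match.value[0])
        and isinstance(match.value[1], str)
    )
```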

@justheuristic
Member Author

the SLOC balance is finally negative :)

@borzunov changed the title from "Declare / find experts by peer id instead of maddrs" to "Use PeerID exclusively to address MoE experts" on Jun 7, 2022
@justheuristic merged commit 25366a1 into master on Jun 7, 2022
@justheuristic deleted the server-p2pid branch on June 7, 2022 17:14