From decd6d791186d606a685f941e95b3c207e67e605 Mon Sep 17 00:00:00 2001
From: Denis Mazur
Date: Wed, 10 Mar 2021 20:58:26 +0300
Subject: [PATCH] Rebase master onto libp2p (#179)

* copytree implementation for py37 compatibility (#162)
* copytree implementation for py37 compatibility
* Running tests for python3.7
* Increment version
* Python3.7 notions

* Remove pickle.loads in averager (#160)
* Security update: remove pickle.loads in averager
* add py37 to circleci config

Co-authored-by: Alexander Borzunov
Co-authored-by: Max Ryabinin

* Support edge cases for DHT key/subkey/value, add tests, update .gitignore for pb2 (#167)
* fix bug with subkey equals zero
* add autogenerated protobuf files to .gitignore
* test store and get "tricky" values in dht

* Fix the remaining tests for py37 (#166)
* DecentralizedAverager is now compatible with python37's asyncio exception
  * the problem was: grpc.aio with python37 raised concurrent.futures.CancelledError in some cases;
  * we relied on isinstance(asyncio.CancelledError, Exception) == False
  * but isinstance(concurrent.futures.CancelledError, Exception) == True
* DecentralizedAverager now shuts down if dereferenced in the main process
  * though it won't shut down if dereferenced in forks for obvious reasons
* HIVEMIND_THREADS now actually works
* test_averaging now shuts down dht and averager instances to avoid leaking processes

Co-authored-by: Max Ryabinin
Co-authored-by: Max Ryabinin

* Move Averager metadata serialization out of user scope (#168)
* move metadata serialization outside user scope
* test_overcrowded: reduce the default number of peers

* Handle edge cases in DecentralizedAverager (#171)
* move metadata serialization outside user scope
* retry averager.step on network errors
* raise AllreduceException on partial tensor
* test split/combine tensors, combine corrupted stream

Co-authored-by: Max Ryabinin

* Fix a typo in quickstart.md (#174)

* Serialize DHTID source with msgpack (#172)
* Change DHTID serializer
* Remove unused serializers
* Add msgpack tuple serialization

* Move CLI server launch script to hivemind/hivemind_cli (#173)
* Cast environment variables to correct types

* Compiling libp2p daemon on setup (#153)
* add setup.py prototype
* refactor
* feat: add p2p daemon (#164)
* Add p2p daemon
* Test p2p daemon exits correctly
* Impose restriction on elapsed time

Co-authored-by: Ilya Kobelev

* compare golang versions using packaging.version
* fix typo

Co-authored-by: justheuristic

* move p2pd executable to hivemind/hivemind_cli

Co-authored-by: Alexey Bukhtiyarov
Co-authored-by: justheuristic
Co-authored-by: Alexander Borzunov
Co-authored-by: Max Ryabinin
Co-authored-by: Michael Diskin
Co-authored-by: romakail <36082689+romakail@users.noreply.github.com>
Co-authored-by: Ilya <37004806+skobellev@users.noreply.github.com>
Co-authored-by: Ilya Kobelev
---
 hivemind/client/averaging/matchmaking.py | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/hivemind/client/averaging/matchmaking.py b/hivemind/client/averaging/matchmaking.py
index 8ec866e51..c711f49a6 100644
--- a/hivemind/client/averaging/matchmaking.py
+++ b/hivemind/client/averaging/matchmaking.py
@@ -467,5 +467,13 @@ async def _declare_averager_periodically(self, key_manager: GroupKeyManager):
                                               looking_for_group=False)
 
 
+def compute_schema_hash(tensors: Sequence[torch.Tensor]) -> bytes:
+    """ A hash that describes follower's tensor shapes, dtypes, devices, but not the actual values """
+    schema_dicts = [{field_name: str(field_value)
+                    for field_name, field_value in asdict(TensorDescriptor.from_tensor(tensor)).items()}
+                    for tensor in tensors]
+    return DHTID.generate(source=schema_dicts).to_bytes()
+
+
 class MatchmakingException(Exception):
     """ An internal exception that marks undesired edge cases during averaging """
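
Note on compute_schema_hash: the sketch below is a standalone illustration of the idea the docstring describes, hashing each tensor's shape, dtype, and device while ignoring its values. It is not the hivemind implementation. `TensorSchema` and `schema_hash` are hypothetical names, and JSON plus SHA-1 are assumed stand-ins for whatever serialization and hashing `DHTID.generate` performs; torch is left out so the example runs with the standard library alone.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TensorSchema:
    """Hypothetical stand-in for TensorDescriptor: metadata only, no values."""
    shape: Tuple[int, ...]
    dtype: str
    device: str


def schema_hash(schemas: Sequence[TensorSchema]) -> bytes:
    """Hash the stringified schema fields; tensor order matters, values do not."""
    schema_dicts = [{name: str(value) for name, value in asdict(schema).items()}
                    for schema in schemas]
    # Assumption: JSON + SHA-1 replace the serializer and hash behind DHTID.generate.
    payload = json.dumps(schema_dicts, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).digest()


local = [TensorSchema((16, 8), "float32", "cpu"), TensorSchema((8,), "float32", "cpu")]
same_layout = [TensorSchema((16, 8), "float32", "cpu"), TensorSchema((8,), "float32", "cpu")]
other_dtype = [TensorSchema((16, 8), "float16", "cpu"), TensorSchema((8,), "float32", "cpu")]

# Identical layouts hash identically; a single dtype change alters the hash.
assert schema_hash(local) == schema_hash(same_layout)
assert schema_hash(local) != schema_hash(other_dtype)
```

In this sketch, two peers with the same tensor layout obtain the same bytes, while any difference in shape, dtype, device, or tensor order produces a different hash.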