Protect training progress and metrics with signatures and DHT schema validation #250

Merged
borzunov merged 15 commits into master from progress-metrics-schemas on May 5, 2021

Conversation

borzunov
Member

@borzunov borzunov commented Apr 29, 2021

This PR follows #219 and implements:

  1. Validating DHT schema for training progress and metrics
  2. Signing the training progress and the metrics (so a peer can't change the stats of someone else)
  3. Using the local public key as a replacement for the peer's UUID
  4. Using type-validated data structures for the stats in the code instead of plain lists
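To illustrate item 4, here is a minimal sketch of what replacing a plain list with a type-validated structure can look like. It uses only the standard library; the class name `ProgressRecord` and its fields are hypothetical and are not taken from this PR's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProgressRecord:
    """Hypothetical stats record; field names are assumptions for illustration."""

    step: int
    samples_accumulated: int
    samples_per_second: float
    timestamp: float

    def __post_init__(self) -> None:
        # Strict type checks: `type(x) is int` also rejects bool, which is an int subclass
        for name in ("step", "samples_accumulated"):
            value = getattr(self, name)
            if type(value) is not int or value < 0:
                raise ValueError(f"{name} must be a non-negative int, got {value!r}")
        for name in ("samples_per_second", "timestamp"):
            value = getattr(self, name)
            if type(value) is not float or value < 0:
                raise ValueError(f"{name} must be a non-negative float, got {value!r}")


# With a plain list, a swapped or malformed field goes unnoticed:
plain = [128, 3, 42.5, 1619654400.0]  # is index 0 the step or the sample count?

# With the structured record, fields are named and checked on construction:
record = ProgressRecord(step=3, samples_accumulated=128,
                        samples_per_second=42.5, timestamp=1619654400.0)
```

Constructing the record with, say, `step=-1` or `step="3"` raises `ValueError` immediately instead of propagating a malformed entry to other peers.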

I have tested the following:

  1. Training (run_first_peer.py and run_trainer.py) works without errors.
  2. When connecting to the DHT from the Python interactive console, the DHT rejects progress/metrics records that:
    • Do not match the schema.
    • Lack a signature or carry an invalid one.
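The signature check can be sketched as follows. This is a toy illustration of the idea only (a record stored under a peer's public key is accepted only if it is signed with the matching private key). It uses textbook RSA with small, publicly known primes and no padding, so it is insecure by design and is not the code from this PR; all function names are hypothetical.

```python
import hashlib
from typing import Optional, Tuple

PublicKey = Tuple[int, int]   # (modulus n, public exponent e)
PrivateKey = Tuple[int, int]  # (modulus n, private exponent d)
E = 65537  # commonly used RSA public exponent


def make_toy_keypair(p: int, q: int) -> Tuple[PublicKey, PrivateKey]:
    """Textbook RSA key pair from two primes. Toy only: never use known primes."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def _digest(owner: PublicKey, value: bytes) -> int:
    # Bind the signature to both the owner's public key and the stored value
    n, e = owner
    payload = f"{n}:{e}:".encode() + value
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % n


def sign(owner: PublicKey, private_key: PrivateKey, value: bytes) -> int:
    n, d = private_key
    return pow(_digest(owner, value), d, n)


def accept_record(owner: PublicKey, value: bytes, signature: Optional[int]) -> bool:
    """Accept a record filed under `owner` only if its signature verifies."""
    if signature is None:
        return False
    n, e = owner
    return pow(signature, e, n) == _digest(owner, value)


# Two peers with different key pairs (the primes are well-known Mersenne primes)
alice_pub, alice_priv = make_toy_keypair(2**89 - 1, 2**127 - 1)
mallory_pub, mallory_priv = make_toy_keypair(2**61 - 1, 2**107 - 1)

value = b'{"step": 10}'
good_signature = sign(alice_pub, alice_priv, value)
forged_signature = sign(alice_pub, mallory_priv, value)  # wrong private key
```

Here `accept_record(alice_pub, value, good_signature)` succeeds, while an unsigned record, a tampered value, or a record signed with another peer's private key is rejected. This mirrors the property that a peer cannot change someone else's stats.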

Along the way, it improves the validator system:

  1. Makes RSASignatureValidator and SchemaValidator picklable because their instances may be sent to the DHT in another process (this happens when we add validators to the existing DHT process after its creation).
  2. Implements caching of the private key while creating RSASignatureValidator, so it is not generated again for each DHT-using entity inside one process (saves time because the generation takes ~100 ms).
  3. Implements the BytesWithPublicKey type for DHT schemas.
  4. Adds a prefix parameter to SchemaValidator (if present, prefix + '_' is prepended to all field names) and switches the field names from bytes keys to str keys (since the rest of the code uses str keys).
  5. Refactors RSASignatureValidator (e.g. better naming like signature_validator.ownership_marker -> signature_validator.local_public_key).
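Items 1, 2 and 4 can be sketched with the standard library alone. The class below is a hypothetical stand-in, not hivemind's RSASignatureValidator: the "private key" is just random bytes, and the helper `add_prefix` is an assumed name for the prefixing behaviour described above.

```python
import hashlib
import pickle
import secrets
import threading
from typing import Dict, Iterable, List, Optional


class ToySignatureValidator:
    """Hypothetical validator that caches its key per process and is picklable."""

    _cached_key: Optional[bytes] = None  # shared by all instances in this process
    _cache_lock = threading.Lock()
    keys_generated = 0  # counts the (assumed expensive) key generations

    def __init__(self) -> None:
        cls = ToySignatureValidator
        with cls._cache_lock:
            if cls._cached_key is None:
                # Stand-in for real key generation, which the PR says takes ~100 ms
                cls._cached_key = secrets.token_bytes(32)
                cls.keys_generated += 1
            self._private_key = cls._cached_key

    @property
    def local_public_key(self) -> bytes:
        # Toy derivation; a real validator would serialize an actual public key
        return hashlib.sha256(self._private_key).digest()

    def __getstate__(self) -> Dict[str, bytes]:
        # Export only plain bytes, so the state can cross a process boundary
        return {"private_key": self._private_key}

    def __setstate__(self, state: Dict[str, bytes]) -> None:
        self._private_key = state["private_key"]


def clone_via_pickle(validator: ToySignatureValidator) -> ToySignatureValidator:
    """Round-trip through pickle, as happens when an object is sent to another process."""
    return pickle.loads(pickle.dumps(validator))


def add_prefix(field_names: Iterable[str], prefix: Optional[str] = None) -> List[str]:
    """Prepend prefix + '_' to every field name when a prefix is given."""
    if prefix is None:
        return list(field_names)
    return [f"{prefix}_{name}" for name in field_names]


first = ToySignatureValidator()
second = ToySignatureValidator()  # reuses the cached key instead of generating one
```

Both instances report the same `local_public_key`, and only one key generation takes place. The class-level cache is a design choice for this sketch; the PR does not say where the cached key is stored.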

@borzunov borzunov changed the title Validate DHT schema for training progress and metrics Protect training progress and metrics with signatures and schema validation Apr 29, 2021
@borzunov borzunov changed the title Protect training progress and metrics with signatures and schema validation Protect training progress and metrics with signatures and DHT schema validation Apr 29, 2021
@borzunov borzunov added the dht and security (Security issues or improvements) labels Apr 29, 2021
@yhn112 yhn112 self-requested a review April 29, 2021 05:38
@yhn112 yhn112 requested a review from mryab April 29, 2021 16:43
@borzunov borzunov marked this pull request as ready for review May 1, 2021 05:39
examples/albert/run_trainer.py (resolved)


def make_validators(experiment_prefix: str) -> Tuple[List[RecordValidatorBase], bytes]:
    signature_validator = RSASignatureValidator()
Member

Will it generate a new key pair after restarting an experiment on the same peer?

Member Author

@borzunov borzunov May 3, 2021

Yes. However, the public key here replaces the trainer UUID, which was also randomly regenerated after each restart.

Do you think a constant key pair (e.g. one per machine) would be useful? I can imagine use cases such as a leaderboard; however, at this stage it may be better to aggregate peers with different key pairs by endpoint/login, since one user may want to run several trainers, each with its own ID (and thus its own key pair).

Member

I think it's reasonable that, for convenience, a single user shouldn't have to generate a new key pair for each run. However, we can probably leave it as is for now if nobody objects.
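For reference, the constant-key-pair idea discussed in this thread could look roughly like the sketch below. It is not part of this PR: the function name and file layout are hypothetical, and random bytes stand in for a serialized private key.

```python
import os
import secrets
from pathlib import Path


def load_or_create_key_material(path: Path) -> bytes:
    """Reuse key material saved by a previous run, or create and save new material.

    Hypothetical helper: the 32 random bytes stand in for a serialized private key.
    """
    if path.exists():
        return path.read_bytes()
    material = secrets.token_bytes(32)
    path.parent.mkdir(parents=True, exist_ok=True)
    # O_EXCL refuses to overwrite an existing file; 0o600 keeps the key owner-only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as key_file:
        key_file.write(material)
    return material
```

Calling the helper twice with the same path returns the same bytes, so a restarted trainer would keep its identity. Running several trainers on one machine would then require a separate path per trainer, which is the trade-off raised above.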

hivemind/dht/crypto.py (resolved)
hivemind/dht/crypto.py (resolved)
hivemind/optim/collaborative.py (outdated, resolved)
hivemind/optim/collaborative.py (outdated, resolved)
hivemind/optim/collaborative.py (outdated, resolved)
tests/test_dht_crypto.py (outdated, resolved)
@borzunov borzunov requested a review from mryab May 3, 2021 14:08
hivemind/dht/crypto.py (outdated, resolved)
hivemind/optim/collaborative.py (outdated, resolved)
@borzunov borzunov requested a review from leshanbog May 3, 2021 15:32
@borzunov borzunov merged commit 3bde618 into master May 5, 2021
@borzunov borzunov deleted the progress-metrics-schemas branch May 5, 2021 12:23
3 participants