Finalize ADR-028 #8398

Merged
merged 32 commits into from
Feb 4, 2021
7b8173b
Finalize ADR-028
robert-zaremba Jan 20, 2021
8643cae
Merge branch 'master' into robert/adr-28
robert-zaremba Jan 20, 2021
6038410
added note about Named Accounts
robert-zaremba Jan 20, 2021
aecb45b
Update docs/architecture/adr-028-public-key-addresses.md
Jan 20, 2021
9a4e078
add reference to \#8041
robert-zaremba Jan 20, 2021
bbfb1b7
typo fix
robert-zaremba Jan 20, 2021
8129844
Apply suggestions from code review
robert-zaremba Jan 21, 2021
2724c7b
simplification, review updates
robert-zaremba Jan 22, 2021
8c44486
date update
robert-zaremba Jan 22, 2021
1557ef5
update paragraph about proto.MessageName
robert-zaremba Jan 22, 2021
aaf471b
remove the module name section - it's described in other section
robert-zaremba Jan 22, 2021
c230c68
Update docs/architecture/adr-028-public-key-addresses.md
robert-zaremba Jan 22, 2021
9b21ec5
remove blake2b
robert-zaremba Jan 27, 2021
9550d11
move some paragraphs to 'Further Discussion'
robert-zaremba Jan 27, 2021
c0a9b20
renames
robert-zaremba Jan 27, 2021
84f57b8
revert and merge 'Multisig Addresses' section
robert-zaremba Jan 27, 2021
c8fa6cb
Apply suggestions from code review
robert-zaremba Jan 29, 2021
3d29aff
Merge branch 'master' into robert/adr-28
robert-zaremba Jan 29, 2021
e407672
add LengthPrefix to Compose definition
robert-zaremba Jan 29, 2021
889b9c5
Merge branch 'master' into robert/adr-28
Jan 29, 2021
ab4b9d1
move composing module accounts to a new subsectoin
robert-zaremba Jan 30, 2021
834d997
adding appendix from meeting with Alan
robert-zaremba Jan 31, 2021
3ceb0f2
composed addresses: use LengthPrefix before sorting
robert-zaremba Feb 1, 2021
274efce
Update docs/architecture/adr-028-public-key-addresses.md
robert-zaremba Feb 1, 2021
461c703
Update docs/architecture/adr-028-public-key-addresses.md
robert-zaremba Feb 1, 2021
201a19e
adding special case for Module Account Addresses
robert-zaremba Feb 1, 2021
5093022
limit 'account' word usage
robert-zaremba Feb 1, 2021
6fb8701
describe submodule derivation
robert-zaremba Feb 2, 2021
eb23aa6
Add discussion notes for the module addresses + bring back 'account' …
robert-zaremba Feb 3, 2021
963b093
Apply suggestions from code review
robert-zaremba Feb 3, 2021
c79c796
changing back to proposed
robert-zaremba Feb 3, 2021
eae740c
Merge branch 'master' into robert/adr-28
robert-zaremba Feb 3, 2021
219 changes: 133 additions & 86 deletions docs/architecture/adr-028-public-key-addresses.md
@@ -3,49 +3,82 @@
## Changelog

- 2020/08/18: Initial version
- 2020/08/15: Analysis and algorithm update

## Status

Proposed
LAST CALL 2021-01-22

## Abstract

This ADR defines a canonical 20-byte address format for new public key algorithms, multisig public keys, and module
accounts using string prefixes.
This ADR defines an address format for all addressable SDK accounts. That includes: new public key algorithms, multisig public keys, and module
accounts.

## Context

Issue [\#3685](https://github.com/cosmos/cosmos-sdk/issues/3685) identified that public key
address spaces are currently overlapping. One initial proposal was extending the address length and
adding prefixes for different types of addresses.
address spaces are currently overlapping. We confirmed that this significantly decreases the security of the Cosmos SDK.


### Problem

An attacker can control an input to an address generation function. This enables a birthday attack, which significantly reduces the security space.
To overcome this, we need to separate the inputs for different kinds of account types:
a security break of one account type shouldn't impact the security of another account type.


### Initial proposals

One initial proposal was extending the address length and
adding prefixes for different types of addresses.

@ethanfrey explained an alternate approach originally used in https://github.com/iov-one/weave:

> I spent quite a bit of time thinking about this issue while building weave... The other cosmos Sdk.

> Basically I define a condition to be a type and format as human readable string with some binary data appended. This condition is hashed into an Address (again at 20 bytes). The use of this prefix makes it impossible to find a preimage for a given address with a different condition (eg ed25519 vs secp256k1).

> This is explained in depth here https://weave.readthedocs.io/en/latest/design/permissions.html

> And the code is here, look mainly at the top where we process conditions. https://github.com/iov-one/weave/blob/master/conditions.go

And explained how this approach should be sufficiently collision resistant:

> Yeah, AFAIK, 20 bytes should be collision resistance when the preimages are unique and not malleable. A space of 2^160 would expect some collision to be likely around 2^80 elements (birthday paradox). And if you want to find a collision for some existing element in the database, it is still 2^160. 2^80 only is if all these elements are written to state.
> The good example you brought up was eg. a public key bytes being a valid public key on two algorithms supported by the codec. Meaning if either was broken, you would break accounts even if they were secured with the safer variant. This is only as the issue when no differentiating type info is present in the preimage (before hashing into an address).

> I would like to hear an argument if the 20 bytes space is an actual issue for security, as I would be happy to increase my address sizes in weave. I just figured cosmos and ethereum and bitcoin all use 20 bytes, it should be good enough. And the arguments above which made me feel it was secure. But I have not done a deeper analysis.
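
The birthday estimate quoted above can be made explicit. This is only a sketch, under the assumption that addresses behave like uniformly random values drawn from a space of size $N = 2^{160}$ and that $k$ addresses have been generated:

$$
P_{\text{collision}}(k) \approx 1 - \exp\!\left(-\frac{k^{2}}{2N}\right), \qquad N = 2^{160}
$$

$$
k = 2^{80} \;\Rightarrow\; \frac{k^{2}}{2N} = \frac{2^{160}}{2^{161}} = \frac{1}{2}, \qquad P_{\text{collision}} \approx 1 - e^{-1/2} \approx 0.39
$$

By contrast, hitting one specific existing address is a second-preimage search, with expected work on the order of $2^{160}$ hash evaluations, which matches the distinction drawn in the quote.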

In discussions in [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694), we agreed to go with an
approach similar to this where essentially we take the first 20 bytes of the `sha256` hash of
the key type concatenated with the key bytes, summarized as `Sha256(KeyTypePrefix || Keybytes)[:20]`.
This led to the first proposal (which we later showed to be not good enough):
we take the first 20 bytes of the `sha256` hash of the key type prefix concatenated with the key bytes, summarized as `sha256(keyTypePrefix || keybytes)[:20]`.


### Review and Discussions

In [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694) we discussed various solutions.
We agreed that 20 bytes is not future-proof, and extending the address length is the only way to allow addresses of different types, various signature types, etc.
This disqualifies the initial proposal.

In the issue we discussed various modifications:
+ Choice of the hash function.
+ Move the prefix out of the hash function: `keyTypePrefix || sha256(keybytes)[:20]` [post-hash-prefix-proposal].
+ Use double hashing: `sha256(keyTypePrefix || sha256(keybytes)[:20])`.
+ Increase the keybytes hash slice from 20 bytes to 32 or 40 bytes. We concluded that 32 bytes, produced by a good hash function, is future-secure.

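The hashing layouts discussed above can be compared side by side. The following is a minimal Go sketch, not SDK code: the function names `FirstProposal`, `PostHashPrefix` and `DoubleHash`, as well as the sample key bytes, are hypothetical and chosen only for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// FirstProposal computes sha256(keyTypePrefix || keybytes)[:20].
func FirstProposal(keyTypePrefix string, keybytes []byte) []byte {
	h := sha256.New()
	h.Write([]byte(keyTypePrefix))
	h.Write(keybytes)
	return h.Sum(nil)[:20]
}

// PostHashPrefix computes keyTypePrefix || sha256(keybytes)[:20].
// The prefix stays outside the hash, so the result grows with the prefix.
func PostHashPrefix(keyTypePrefix string, keybytes []byte) []byte {
	sum := sha256.Sum256(keybytes)
	return append([]byte(keyTypePrefix), sum[:20]...)
}

// DoubleHash computes sha256(keyTypePrefix || sha256(keybytes)[:20]).
func DoubleHash(keyTypePrefix string, keybytes []byte) []byte {
	inner := sha256.Sum256(keybytes)
	h := sha256.New()
	h.Write([]byte(keyTypePrefix))
	h.Write(inner[:20])
	return h.Sum(nil)
}

func main() {
	key := []byte{0x02, 0x01, 0x02, 0x03} // placeholder bytes, not a real public key
	fmt.Println(len(FirstProposal("secp256k1", key)))  // 20
	fmt.Println(len(PostHashPrefix("secp256k1", key))) // 29 (9-byte prefix + 20)
	fmt.Println(len(DoubleHash("secp256k1", key)))     // 32
}
```

Only the post-hash-prefix layout makes the address length depend on the prefix; in the other two the type information is absorbed into the hash input.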
### Requirements

+ Support currently used tools - we don't want to break an ecosystem, or add a long adaptation period.
Contributor

Can this be more explicit? Which tools in particular? Is the goal here just to keep the secp key formats as they are?

Collaborator Author

Mainly wallets or services which already rely on the current addresses. More details about this are in #8041; I will link it directly in the ADR.

+ Try to keep the address length small - addresses are widely used in state, both as part of a key and object value.


### Scope

This ADR defines only the address bytes. At the API level we already use bech32, and this ADR doesn't change that.
Bech32 provides checksum-based error detection and handles user typos.


## Decision

### Legacy Public Key Addresses Don't Change

`secp256k1` and multisig public keys are currently in use in existing Cosmos SDK zones. They use the following
address formats:
Currently (Jan 2021), the only officially supported SDK user accounts are `secp256k1` basic accounts and legacy amino multisig.
They are used in existing Cosmos SDK zones. They use the following address formats:

- secp256k1: `ripemd160(sha256(pk_bytes))[:20]`
- legacy amino multisig: `sha256(aminoCdc.Marshal(pk))[:20]`
@@ -56,42 +89,90 @@ The current multisig public keys use amino serialization to generate the address
those public keys and their address formatting, and call them "legacy amino" multisig public keys
in protobuf. We will also create multisig public keys without amino addresses to be described below.

### Hash Function Choice

We propose to use [blake2b](https://www.blake2.net/) as a hash function choice:
+ The main arguments are speed and separating from `sha256` which is widely used
by miners and could potentially be used to find collisions.
+ The function was in the final round of the 2012 NIST hash function competition.
+ It's well studied with security covered in many academic papers.
+ Faster than `sha2` on non-ASIC chipsets.
+ It's getting more traction in other blockchains (Polkadot, Sia, Zcash, ...). Related [zcash discussion](https://github.com/zcash/zcash/issues/706#issuecomment-187807410).
+ It's already widely supported by all major programming languages.
+ Cryptography consulting revealed no argument against `blake2b`


### Base Address Algorithm

We start with defining a base algorithm for generating addresses. Notably, it's used for Base Accounts (accounts represented by a single key-pair) addresses. For each Public Key schema we need to have an associated `typ` string, which we will discuss in a section below. `hash` is a cryptographic hash function defined in the previous section.

```go
const A_LEN = 32

func BaseAddress(typ string, pubkey []byte) []byte {
return hash(hash(typ) + pubkey)[:A_LEN]
Contributor

Why not do a personalization string if the underlying hash supports it?

It's also worth noting that, for efficiency, hash(typ) should ideally have length block size OR size(hash(type) + pubkey) < 1 block size

Collaborator Author

I'm not sure I follow you:

  • What do you mean by _personalization string_?
  • The returned bytes are capped at A_LEN. hash is defined in the section above. I specifically didn't want to name it here: in case we edit it, we only need to change one place.
  • With block size, do you mean a blockchain block size? That's a totally different scale.

}
```

The `+` is a bytes concatenation, which doesn't use any separator.

### Canonical Address Format
This algorithm is the outcome of a consulting session with a cryptographer.
Motivation: this algorithm keeps the address relatively small (the length of `typ` doesn't impact the length of the final address)
and it's more secure than [post-hash-prefix-proposal] (reducing the pubkey hash to 20 bytes significantly reduces the address space).
Moreover, the cryptographer motivated the choice to add `typ` in the hash as protection against a switch table attack.

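The base algorithm can be sketched in runnable Go. This is a minimal illustration under stated assumptions, not the SDK implementation: the ADR leaves `hash` abstract (this revision of the diff proposes blake2b, which is not in the Go standard library), so SHA-256 is used here purely as a dependency-free stand-in. The constant `AddrLen` mirrors `A_LEN`, and the sample key bytes are placeholders.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// AddrLen mirrors A_LEN from the ADR text.
const AddrLen = 32

// hash stands in for the ADR's abstract hash function.
// Assumption: SHA-256 is used only to keep the sketch self-contained.
func hash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

// BaseAddress computes hash(hash(typ) + pubkey)[:AddrLen],
// where + is plain byte concatenation without a separator.
func BaseAddress(typ string, pubkey []byte) []byte {
	preimage := append(hash([]byte(typ)), pubkey...)
	return hash(preimage)[:AddrLen]
}

func main() {
	pub := []byte{0x01, 0x02, 0x03, 0x04} // placeholder bytes, not a real public key

	a := BaseAddress("cosmos.crypto.sr25519.PubKey", pub)
	b := BaseAddress("cosmos.crypto.secp256k1.PubKey", pub)

	fmt.Println(hex.EncodeToString(a))
	fmt.Println(hex.EncodeToString(b))
	// Same key bytes under different type strings land in different addresses.
	fmt.Println(string(a) != string(b))
}
```

Because `hash(typ)` has a fixed length, the boundary between the type part and the key part of the preimage is unambiguous even without a separator.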
We have three types of accounts we would like to create addresses for in the future:
- regular public key addresses for new signature algorithms (ex. `sr25519`).
- public key addresses for multisig public keys that don't use amino encoding
- module accounts: basically any accounts which cannot sign transactions and
which are managed internally by modules

To address all of these use cases we propose the following basic `AddressHash` function,
based on the discussions in [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694):
### Composed Account Address Algorithm

We will generalize the `BaseAddress` algorithm to define an address for an account which is represented by a set of sub-accounts (example: group module accounts, multisig accounts...).
The address is constructed by recursively creating addresses for the sub-accounts, sorting the addresses and composing them into a single address:

```go
func AddressHash(prefix string, contents []byte) []byte {
preImage := []byte(prefix)
if len(contents) != 0 {
preImage = append(preImage, 0)
preImage = append(preImage, contents...)
}
sum := sha256.Sum256(preImage)
return sum[:20]

type Acc interface {
Typ() string
SubAccounts() []Acc
...
}

type BaseAccount interface {
Acc
PubKey() crypto.PubKey
}

func Address(acc Acc) []byte {
typ := acc.Typ()
if acc is BaseAccount {
return BaseAddress(typ, acc.PubKey())
}
subaccounts := acc.SubAccounts()
addresses := map(subaccounts, Address)
addresses = sort(addresses)
n := len(addresses) - 1

return BaseAddress(typ, addresses[0] + ... + addresses[n])
}
```
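
The pseudocode above can be turned into a small runnable sketch. It follows the algorithm as written in this revision (sort, then concatenate, no length prefix) and makes several assumptions that are not from the ADR: SHA-256 stands in for the abstract `hash`, `PubKey()` returns raw bytes instead of `crypto.PubKey`, and the concrete types `keyAcc` and `groupAcc` are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// Acc and BaseAccount follow the interfaces in the ADR text.
type Acc interface {
	Typ() string
	SubAccounts() []Acc
}

type BaseAccount interface {
	Acc
	PubKey() []byte // assumption: raw bytes instead of crypto.PubKey
}

// hash stands in for the ADR's abstract hash function (assumption: SHA-256).
func hash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return sum[:]
}

func BaseAddress(typ string, pubkey []byte) []byte {
	return hash(append(hash([]byte(typ)), pubkey...))[:32]
}

// Address recursively derives sub-account addresses, sorts them,
// concatenates them and feeds the result into BaseAddress.
func Address(acc Acc) []byte {
	if base, ok := acc.(BaseAccount); ok {
		return BaseAddress(base.Typ(), base.PubKey())
	}
	subs := acc.SubAccounts()
	addrs := make([][]byte, 0, len(subs))
	for _, s := range subs {
		addrs = append(addrs, Address(s))
	}
	sort.Slice(addrs, func(i, j int) bool { return bytes.Compare(addrs[i], addrs[j]) < 0 })
	return BaseAddress(acc.Typ(), bytes.Join(addrs, nil))
}

// keyAcc and groupAcc are hypothetical implementations for illustration only.
type keyAcc struct {
	typ string
	pub []byte
}

func (k keyAcc) Typ() string        { return k.typ }
func (k keyAcc) SubAccounts() []Acc { return nil }
func (k keyAcc) PubKey() []byte     { return k.pub }

type groupAcc struct {
	typ  string
	subs []Acc
}

func (g groupAcc) Typ() string        { return g.typ }
func (g groupAcc) SubAccounts() []Acc { return g.subs }

func main() {
	a := keyAcc{typ: "example.PubKey", pub: []byte{1}}
	b := keyAcc{typ: "example.PubKey", pub: []byte{2}}
	g1 := groupAcc{typ: "example.Group", subs: []Acc{a, b}}
	g2 := groupAcc{typ: "example.Group", subs: []Acc{b, a}}
	fmt.Println(bytes.Equal(Address(g1), Address(g2))) // true
}
```

Sorting the sub-account addresses before concatenation is what makes the composed address independent of the order in which sub-accounts are listed.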

`AddressHash` always takes a string `prefix` as a starting point which should represent the
type of public key (ex. `sr25519`) or module account being used (ex. `staking` or `group`).
For public keys, the `contents` parameter is used to specify the binary contents of the public
key. For module accounts, `contents` can be left empty (for modules which don't manage "sub-accounts"),
or can be some module-specific content to specify different pools (ex. `bonded` or `not-bonded` for `staking`)
or managed accounts (ex. different accounts managed by the `group` module).
Implementation Tip: `Acc` implementations should cache address in their attributes.

In the `preImage`, the byte value `0` is used as the separator between `prefix` and `contents`. This is a logical
choice given that `0` is an invalid value for a string character and is commonly used as a null terminator.
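
The role of the `0` separator can be illustrated with a runnable version of `AddressHash`. This is a sketch of the earlier (removed) design only; the `main` function and its sample inputs are hypothetical. The digest is stored in a variable before slicing because Go cannot slice the non-addressable array returned by `sha256.Sum256` directly.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// AddressHash hashes prefix, then (only if contents is non-empty)
// a 0 byte separator followed by contents, and keeps the first 20 bytes.
func AddressHash(prefix string, contents []byte) []byte {
	preImage := []byte(prefix)
	if len(contents) != 0 {
		preImage = append(preImage, 0)
		preImage = append(preImage, contents...)
	}
	sum := sha256.Sum256(preImage)
	return sum[:20]
}

func main() {
	// Module account without sub-accounts: contents left empty.
	module := AddressHash("staking", nil)
	// Module account with module-specific contents.
	pool := AddressHash("staking", []byte("bonded"))
	fmt.Println(len(module), len(pool)) // 20 20

	// Thanks to the separator, ("ab", "c") and ("a", "bc") have different
	// preimages even though their plain concatenations are identical.
	x := AddressHash("ab", []byte("c"))
	y := AddressHash("a", []byte("bc"))
	fmt.Println(bytes.Equal(x, y)) // false
}
```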

### Canonical Public Key Address Prefixes
### Native Composed Accounts

All public key types will have a unique protobuf message type such as:
For accounts with a well specified public key composed of other public keys (various algorithms for aggregated signatures),
we will use a public key defined by the composition algorithm and we will call it _composed pubkey_. Example: BLS multisig.
The address algorithm for such accounts is the same as for a `BaseAccount`. In the example below, `na` is an object representing a native composed account.

```
na.address = BaseAddress(na.typ, na.ComposedPubKey)
```

### Account Types

The Account Types used in various account classes SHOULD be unique for each class.
Since both public keys and accounts are serialized in the state, we propose to use the protobuf message name string (`proto.MessageName(msg)`).

Example: all public key types have a unique protobuf message type similar to:

```proto
package cosmos.crypto.sr25519;
@@ -100,69 +181,35 @@ message PubKey {
bytes key = 1;
}
```

All protobuf messages have unique fully qualified names, in this example `cosmos.crypto.sr25519.PubKey`.
These names are derived directly from .proto files in a standardized way and used
in other places such as the type URL in `Any`s. Since there is an easy and obvious
way to get this name for every protobuf type, we can use this message name as the
key type `prefix` when creating addresses. For all basic public keys, `contents`
should just be the raw unencoded public key bytes.

Thus the canonical address for new public key types would be `AddressHash(proto.MessageName(pk), pk.Bytes)`.

### Multisig Addresses

For new multisig public keys, we define a custom address format not based on any encoding scheme
(amino or protobuf). This avoids issues with non-determinism in the encoding scheme. It also
ensures that multisig public keys which differ simply in the ordering of keys have the same
address by sorting child public keys first.

First we define a proto message for multisig public keys:
```proto
package cosmos.crypto.multisig;

message PubKey {
uint32 threshold = 1;
repeated google.protobuf.Any public_keys = 2;
}
```

We define the following `Address()` function for this public key:

```
func (multisig PubKey) Address() []byte {
// first gather all the addresses of each nested public key
var addresses [][]byte
for _, key := range multisig.Keys {
addresses = append(addresses, key.Address())
}

// then sort them in ascending order
addresses = Sort(addresses)

// then concatenate them together
var joinedAddresses []byte
for _, addr := range addresses {
joinedAddresses = append(joinedAddresses, addr...)
}

// form the string prefix from the message name (cosmos.crypto.multisig.PubKey) and the threshold joined together
prefix := fmt.Sprintf("%s/%d", proto.MessageName(multisig), multisig.Threshold)

// use the standard AddressHash function
return AddressHash(prefix, joinedAddresses)
}
```

## Consequences

### Positive
- a simple algorithm for generating addresses for new public keys and module accounts

- a simple algorithm for generating addresses for new public keys, complex accounts and module accounts
- the algorithm generalizes for _native composed keys_
- increase security and collision resistance of addresses
- the approach is extensible for future use-cases - one can use shorter addresses (>20 and < 32) for other use-cases.

### Negative

- addresses do not communicate key type, a prefixed approach would have done this
- addresses are 60% longer and will consume more storage space

### Neutral
- protobuf message names are used as key type prefixes


## References

* [Notes](https://hackmd.io/_NGWI4xZSbKzj1BkCqyZMw) from consulting meeting with [Alan Szepieniec](https://scholar.google.be/citations?user=4LyZn8oAAAAJ&hl=en).
* Blake2b security analysis: [1](https://eprint.iacr.org/2013/467), [2](https://eprint.iacr.org/2014/1012), [3](https://eprint.iacr.org/2015/515).