Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: update IPNS specification #319

Merged
merged 13 commits into from
Oct 19, 2022
Merged

feat: update IPNS specification #319

merged 13 commits into from
Oct 19, 2022

Conversation

lidel
Copy link
Member

@lidel lidel commented Sep 5, 2022

Closes #314

This PR aims to document implementable IPNS Record creation and validation.

I've documented IPNS record creation and verification based on what latest Kubo and JS-IPFS do, but without v1 signatures (as we want to remove support for them anyway).

Would appreciate additional 👁️ from implementers and people working with IPNS.

TODO

  • Describe IPNS Name
    • explain what libp2p-key is and how tis protobuf looks like
    • explain when inlining makes sense
    • document String representation as CID
  • Describe IPNS Record protobuf and DAG-CBOR part
    • extensible record data in strict DAG-CBOR
    • signatures v1 creation
    • signatures v2: creation and validation
  • Describe IPNS Keys section
  • reference DHT and PubSub specs
  • backward-compatibility policy
  • go over additional asks listed in Update IPNS specification #314

Open questions

  • move everything to ipns/ ?
  • is it ok to only require Ed25519?
    • it is MUST, we also have RSA which is 'SHOULD'
  • is it ok to state that IpnsEntry.data DAG-CBOR (to avoid duplicating strictness spec from https://ipld.io/specs/codecs/dag-cbor/spec/)
  • is it ok to simply not document v1 signatures?
    • yes, this is legacy which we want to go away (only relevant to Kubo and JS-IPFS)
  • if we dont support v1, do we need to duplicate value, validity, validityType, sequence, and ttl (as IpnsEntry.* and in CBOR ar IpnsEntry.data)?

cc @2color @mrodriguez3313 @TMoMoreau @rvagg @vasco-santos

@lidel lidel requested review from Stebalien, rvagg, vasco-santos and a team September 5, 2022 15:47
@Winterhuman
Copy link

Just to confirm, the spec states "Before parsing protobuf, confirm IpnsEntry size is less than 2 MiB", however, will IPFS implementations (specifically Kubo) likely only support smaller IPNS records? Is there a minimum max size of IPNS record that all implementations must support?

IPNS.md Outdated Show resolved Hide resolved
@lidel
Copy link
Member Author

lidel commented Sep 5, 2022

@Winterhuman afaik there is no explicit max size of IPNS record defined anywhere, but for sure big record wil lbreak something.
That is why I added this explicit number to this spec (so we can implement the same thing across implementations).

The only reason I picked 2 MiB is to be in the ballpark of the bitswap block limit (ipfs/kubo#8968)

@Winterhuman
Copy link

Ah I see! I do think a minimum max size is needed, it'd be nice to have a safe amount of data that can be inlined into IPNS records (by Value and/or by data) without any risk of being rejected by specific implementations, even like 4KiB or something would be good just so it's a definitive spec-defined size

@lidel lidel self-assigned this Sep 6, 2022
@lidel
Copy link
Member Author

lidel commented Sep 9, 2022

Pushed some changes, expanding separate sections about "IPNS Name" and "IPNS Keys", formally decoupling IPNS spec from PeerIDs.

Copy link
Contributor

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍 at this writeup. I didn't read through things too precisely for correctness, but left a few comments that might be useful.

IPNS.md Outdated

Taking into consideration a p2p network, each peer should be able to publish IPNS records to the network, as well as to resolve the IPNS records published by other peers.
- The maximum size of `IpnsEntry` is 2 MiB. Bigger records MUST be ignored by IPNS implementations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #273 (comment) 2MiB seems quite large (and won't work with IPNS over PubSub). If data is ever super large it could probably just point at a CID or another IPNS record instead.

Some ways it might be reasonable to establish a magic number here if we don't want to just spitball a number (which is also doable)

  • What do records in the network currently look like? (e.g. use Hydras to analyze stored records)
  • Look at something like a "maximum CID size" and add some more to it (e.g. double it). The indexer folks could probably give some insight here as would the related issues on this topic. Blocking on people committing to a maximum CID size seems unwise since that could take a while/not happen so we'd have to guesstimate here
  • What's the longest path we've seen show up on ipfs.io
  • Max public key size
    • Probably a 4096 RSA key, but you could ask the libp2p folks.
    • If people have really large keys we could either change the protocol to go back to having people find them separately from the records, or keep their values smaller to tradeoff against their key size and push towards smaller key types

Aside: it might also be worth considering switching the wording here to be similar to how Bitswap has block sizes specified.

Bitswap implementations must support sending and receiving individual blocks of sizes less than or equal to 2MiB. Handling blocks larger than 2MiB is not recommended so as to keep compatibility with implementations which only support up to 2MiB.

It's a bit of a different take, but could be useful to document (even just on GitHub) why we chose one approach over the other.

Copy link
Member Author

@lidel lidel Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things are more complicated: the introduction of the extensible IpnsEntry.data CBOR field means some people and services WILL start putting additional metadata there, and that means all bets and estimates will be off.

Still, we need to create some arbitrary headroom. It is easier to increase the limit than to decrease it.
I've switched to max to 10 kiB as it is closer to realistic needs, and adjusted the language to match bitswap one (see 1baf069).

Will try to gather some data from hydras, but in the meantime, @alanshaw / @vasco-santos do you have any metrics around record sizes on w3name? or thoughts on this matter?

Copy link
Member Author

@lidel lidel Sep 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some quick numbers from hydras (records that are currently valid)

Total DynamoDB Records: 256697
  ipns: 12287
  pk:   244410

DynamoDB Record Value Sizes:
  avg:    299.62269913555673
  std:    45.88040769545226
  median: 299
  max:    2033
  min:    36

PK Records Key Types:
  rsa:       244403
  ed25519:   3
  secp256k1: 0
  ecdsa:     4

And just the extensible data field

IPNS 'IpnsEntry.Data' Size:
  avg:    139.27228998453873
  std:    9.81409166725822
  median: 141
  max:    182
  min:    91

10 KiB sounds ok, leaving some room for growth and experimentation without being blocked by implementations that follow this spec, but lmk if we should adjust.

IPNS.md Outdated
Comment on lines 43 to 46
Implementations MUST support Ed25519 with signatures defined in [RFC8032](https://www.rfc-editor.org/rfc/rfc8032#section-5.1).

Implementations MAY support RSA, Secp256k1 and ECDSA for private use, but peers
from the public IPFS swarm and DHT may not be able to resolve IPNS records
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems mostly reasonable as this copies from the libp2p peerID spec, however RSA keys might still be very common since they used to be the default in implementations like kubo (formerly go-ipfs) and js-ipfs. This is the same as in the libp2p world, but given we're not just referencing that spec directly figured I'd call it out here in case anyone comes looking.

It'd be nice if we could post some numbers here (on GitHub) coming from the Hydra nodes on the distribution of key types justifying that Ed25519 covers most usage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try to get % usage of Ed25519, RSA, Secp256k1 and ECDSA and get back to this with numbers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some quick numbers from hydras (records that are currently valid)

Total DynamoDB Records: 256697
  ipns: 12287
  pk:   244410

DynamoDB Record Value Sizes:
  avg:    299.62269913555673
  std:    45.88040769545226
  median: 299
  max:    2033
  min:    36
IPNS Records Signatures:
  v1:     12287
  v2:     6892
  with pk: 913
PK Records Key Types:
  rsa:       244403
  ed25519:   3
  secp256k1: 0
  ecdsa:     4

ed25519 is used by default and inlined, "3" is probably someone writing own IPNS tool and embedding it the same way RSA - can be ignored.

“with pk” means IPNS records with IpnsEntry.pubKey present.
RSA is dominating here, and used in ~7% of cases.

I think Ed25519 being ~93% and RSA ~7% is a good argument to keep Ed25519 as MUST and change RSA to SHOULD (same as in PeerID spec).

I adjusted language in 41612da

IPNS.md Outdated Show resolved Hide resolved
IPNS.md Outdated Show resolved Hide resolved
IPNS.md Outdated Show resolved Hide resolved
IPNS.md Outdated Show resolved Hide resolved
@lidel lidel force-pushed the chore/update-ipns-specification branch 2 times, most recently from 8959d27 to b926e8c Compare September 14, 2022 13:03
IPNS.md Outdated
Comment on lines 226 to 229
- A legacy publisher MUST always be able to update to the latest implementation
of this specification without breaking record resolution for legacy consumers.
- A legacy consumer MUST always be able to resolve IPNS name, even when publisher
updated to the latest implementation of this specification.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some DHT stats

We've looked at some stats from Hydra nodes (a small sample of recently valid records):

IPNS Records Signatures:
  v1:       9513
  v2:       5821
  with pk:  733 (RSA)
  nil data: 3692 (V1-only)

Every V2 record is also V1 (V1-only records have no data field, so their count is in "nil data").
This means for the sample size of 9513 we have ~61% of V2+V1 and ~38% V1-only.
We also looked at other time-ranges and V2 is growing, V1-only is decreasing – seems to follow the usual slow update curve we also see for Kubo itself.

On this upgrade strategy

It is reasonable to expect both clients and publishers will update over time,
but the act of updating a publisher should not introduce any record resolution regressions for legacy nodes.

To make this clear, I've documented backward compatibility policy in b926e8c

  • A legacy publisher MUST always be able to update to the latest implementation
    of this specification without breaking record resolution for legacy consumers.
  • A legacy consumer MUST always be able to resolve IPNS name, even when publisher
    updated to the latest implementation of this specification.

This policy ensures the spec is aligned with existing code that was created to ensure IPNS implementations in GO and JS are backward-compatible from the perspective of legacy consumers:

  • both Kubo and js-ipfs produce V1+V2 records
    • This allows legacy go-ipfs <0.9.0 node to correctly resolve IPNS records published by modern Kubo 0.15+
    • Maintaining this contract is important, I've added it to this spec
  • Old go-ipfs <0.9.0 node publishing legacy RSA+V1-only records should be able to upgrade to Kubo 0.16 and start publishing V1+V2 records, allowing for V2 resolution on modern clients (but not break V1-only for legacy ones).

lidel added a commit to ipfs/go-ipns that referenced this pull request Sep 20, 2022
achingbrain pushed a commit to ipfs/js-ipfs that referenced this pull request Sep 22, 2022
Co-authored-by: achingbrain <[email protected]>

js-ipfs will now ignore V1 signatures when validating IPNS records, and always require V2.

See ipfs/js-ipns#180 for low level details (needs that to be released before merging this)

Closes #4197

BREAKING  CHANGE: IPNS Records that do not have V2 but have V1 signature will no longer pass validation, even if V1 is correct. V2 is mandatory to pass validation. See "Record validation" in ipfs/specs#319 for details.
IPNS.md Outdated
## Overview
- A legacy publisher MUST always be able to update to the latest implementation
of this specification without breaking record resolution for legacy consumers.
- A legacy consumer MUST always be able to resolve IPNS name, even when publisher

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does "resolve IPNS" name mean they should be able to validate the signature? Or just that they can read the record?
Isn't that already not the case since supporting all Key types isn't required there's already consumers that can't validate all signtures? And new key types (or signing prefixes) would invalidate old consumers?

Copy link
Member Author

@lidel lidel Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does "resolve IPNS" name mean they should be able to validate the signature?

@paulgmiller Yes, IPNS resolution includes signature verification step: records that have invalid signatureV2 are ignored by implementations like Kubo or js-ipfs.

And new key types (or signing prefixes) would invalidate old consumers?

The use case we are protecting here is preexisting IPNS client resolving preexisting IPNS name for mission-critical updates.

In that case, the key type will not change, records are published with the same key, and the old signature must be present next to the new one to ensure the name still resolves for the legacy client.

An example scenario is as follows:

  • IoT device implemented minimal support for RSA and V1, and uses it for polling for new firmware updates.
  • It retrieves record with V1+V2 signatures, validates V1 and fetches new firmware
  • The new firmware has support for IPNS V2, and uses a new IPNS name for polling for updates (requires V2 signatures, and can also use a different key type like Ed25519).

- IPNS Name and Record protobufs, ascii and logical structs
- extensible record data in strict CBOR
- signatures v2: creation and validation
this removes any confusion around V1 and V2
The size is not mentioned once, making it easier to change in the
future.
We have to document legacy signature, because it is the only reason for
copying 'data' fields into the main protobuf.

Added section about backward compatibility so our approach is more clear.
@lidel lidel force-pushed the chore/update-ipns-specification branch from 0e8f02e to 7a7163e Compare October 19, 2022 21:52
Gently point at new location (until we clean up docs.ipfs.tech)
@lidel lidel force-pushed the chore/update-ipns-specification branch from 7a7163e to 0453b84 Compare October 19, 2022 21:55
@lidel
Copy link
Member Author

lidel commented Oct 19, 2022

Merging since the hardened IPNS record verification shipped in:

@lidel lidel merged commit e372b24 into main Oct 19, 2022
@lidel lidel deleted the chore/update-ipns-specification branch October 19, 2022 22:03
Jorropo pushed a commit to ipfs/go-libipfs-rapide that referenced this pull request Mar 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Archived in project
Status: Done
Development

Successfully merging this pull request may close these issues.

Update IPNS specification
4 participants