fip: 0027
title: Change type of DealProposal Label field from a (Golang) String to a Union
author: Laudiacay (@laudiacay), Steven Allen (@Stebalien), Aayush Rajasekaran (@arajasek)
discussions-to:
Status: Accepted
type: Technical
category: Core
created: 2021-09-29
spec-sections: specs-actors

FIP-0027: Change type of DealProposal Label field from a (Golang) String to a Union

Simple Summary

Deal makers on Filecoin can specify a "label" for their deals. Because this label is stored on the Filecoin blockchain, a "bad" label must not be able to cause issues in nodes on the Filecoin network. Today, Filecoin does not enforce that this label is valid UTF-8, even though valid CBOR-encoded strings must be UTF-8. Systems also increasingly assume that all strings are UTF-8 (see here for more). This FIP proposes a change that removes this anomaly.

Abstract

The Label field of the DealProposal in the market state is currently a String that is not required to be UTF-8. This falls outside the CBOR string specification, and can also be a source of bugs and difficulties in client implementations because String libraries differ. Rather than enforce UTF-8 everywhere, this FIP changes the type to a Union that can hold either a String or raw bytes. This achieves CBOR compliance and safety while still allowing users to store arbitrary bytes in the Label.

Change Motivation

Discussion is here and here

Summary: @ec2 noted that non-UTF-8 strings in this field are causing problems for client implementations such as ChainSafe's. @ribasushi noticed that the Label field takes user input more or less directly, so enforcing UTF-8 might be difficult and pointless. Finally, @mikeal posted notes from an IPLD call saying that the source of the problem is that some programming languages enforce UTF-8 on Strings while others don't, so the most widely compatible type for any on-chain data like this would be a plain byte array.
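
To make the language difference concrete, here is a minimal Go sketch showing that a Go string is just a sequence of bytes and is never checked for UTF-8 on construction. The helper name `IsValidLabelString` is hypothetical and not part of the FIP or any Filecoin implementation; only the standard `unicode/utf8` package is used.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// IsValidLabelString is a hypothetical helper (the name is not from the FIP).
// It reports whether s holds well-formed UTF-8.
func IsValidLabelString(s string) bool {
	return utf8.ValidString(s)
}

func main() {
	ascii := "deal-label"                   // ASCII is always valid UTF-8
	raw := string([]byte{0xff, 0xfe, 0x00}) // Go stores these bytes in a string as-is

	fmt.Println(IsValidLabelString(ascii), len(ascii))
	fmt.Println(IsValidLabelString(raw), len(raw))
}
```

A language whose string type requires UTF-8 would typically reject the second value when it is decoded, which is the kind of cross-implementation mismatch described above.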

Specification

The proposal is to change the Label to be a Union type, with the following IPLD schema:

type Label union {
  | String string
  | Bytes bytes
} representation kinded

String-type labels must be valid UTF-8; this is enforced. An analogous type can be seen here, with its encoding found here.
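
The following Go sketch illustrates one way such a union could be modelled and encoded. It is an assumption-laden illustration, not the actors implementation: the `Label` struct, its constructors, `ErrNotUTF8`, and `Encode` are hypothetical names, and the encoder hand-writes CBOR heads for short labels only. It relies on the CBOR major types for byte strings (2) and text strings (3).

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Label is a hypothetical in-memory form of the union: either a UTF-8 string
// or raw bytes, never both.
type Label struct {
	isString bool
	data     []byte
}

// ErrNotUTF8 is returned when a string label is not well-formed UTF-8.
var ErrNotUTF8 = errors.New("string label is not valid UTF-8")

// NewLabelFromString builds the String member and enforces UTF-8.
func NewLabelFromString(s string) (Label, error) {
	if !utf8.ValidString(s) {
		return Label{}, ErrNotUTF8
	}
	return Label{isString: true, data: []byte(s)}, nil
}

// NewLabelFromBytes builds the Bytes member; any byte sequence is accepted.
func NewLabelFromBytes(b []byte) Label {
	return Label{data: append([]byte(nil), b...)} // copy so callers cannot mutate it
}

// Encode writes the label as a single CBOR item. With a kinded representation
// there is no wrapper: the CBOR major type alone (3 = text, 2 = bytes) tells a
// decoder which member it is reading.
func (l Label) Encode() ([]byte, error) {
	major := byte(2)
	if l.isString {
		major = 3
	}
	n := len(l.data)
	var head []byte
	switch {
	case n < 24:
		head = []byte{major<<5 | byte(n)}
	case n <= 0xff:
		head = []byte{major<<5 | 24, byte(n)}
	case n <= 0xffff:
		head = []byte{major<<5 | 25, byte(n >> 8), byte(n)}
	default:
		return nil, errors.New("sketch only handles labels up to 65535 bytes")
	}
	return append(head, l.data...), nil
}

func main() {
	s, _ := NewLabelFromString("hello")
	enc, _ := s.Encode()
	fmt.Printf("% x\n", enc)

	b := NewLabelFromBytes([]byte{0xff, 0x00})
	enc, _ = b.Encode()
	fmt.Printf("% x\n", enc)

	_, err := NewLabelFromString(string([]byte{0xff}))
	fmt.Println(err)
}
```

Keeping the UTF-8 check inside the String constructor means an invalid string label cannot be built in the first place, while the Bytes constructor stays permissive for users who need arbitrary data.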

Design Rationale

For simplicity and maximum flexibility for users, the ideal design would simply have been a byte array. However, naively changing the Label field in the DealProposal from a string to bytes changes the serialization of the DealProposal. The current dealmaking flow on Filecoin is as follows (some aspects of this flow aren't protocol-specific, but might as well be, because they reflect the reality of dealmaking on Filecoin today):

  1. Client creates a DealProposal, signs it, wraps the proposal and signature into a ClientDealProposal, and sends that to the Storage Provider (SP)
  2. SP deserializes the ClientDealProposal, validates the signature, adds it to a batch, and eventually sends a PublishStorageDeals message (submitting the serialized ClientDealProposal to the chain)
  3. Validators of the Filecoin blockchain:
  • deserialize the ClientDealProposal into a DealProposal and a signature
  • serialize the DealProposal
  • validate that the signature is valid for the serialized DealProposal

The issue is that all of these de/serialization steps fail if they are performed on opposite sides of the switch from string to bytes. It is not clear when clients should start using bytes instead of strings, or when SPs should start expecting clients to use bytes instead of strings. Further, all PublishStorageDeals messages sitting in node message pools (waiting to be included in a block) at the time of the network upgrade will fail when they land on chain.
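
The failure mode can be sketched in Go as follows. This is an illustration under stated assumptions: `encodeProposal` and `signatureSurvives` are hypothetical helpers, only the label is serialized (as a single short CBOR item), and Ed25519 from the standard library stands in for whichever signature scheme the network actually uses.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// encodeProposal stands in for DealProposal serialization. Only the label is
// encoded here, as one CBOR item: major type 3 (text) or 2 (bytes).
func encodeProposal(label []byte, asBytes bool) []byte {
	if len(label) >= 24 {
		panic("sketch only handles labels shorter than 24 bytes")
	}
	major := byte(3)
	if asBytes {
		major = 2
	}
	out := []byte{major<<5 | byte(len(label))}
	return append(out, label...)
}

// signatureSurvives signs one encoding of the proposal and verifies against
// another, mimicking a client and a validator on either side of the switch.
func signatureSurvives(signAsBytes, verifyAsBytes bool) bool {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		panic(err)
	}
	label := []byte("my-deal")
	sig := ed25519.Sign(priv, encodeProposal(label, signAsBytes))
	return ed25519.Verify(pub, encodeProposal(label, verifyAsBytes), sig)
}

func main() {
	fmt.Println("string -> string:", signatureSurvives(false, false))
	fmt.Println("string -> bytes: ", signatureSurvives(false, true))
}
```

The label bytes are identical in both cases; only the leading CBOR type byte differs, and that single byte is enough to invalidate a signature made over the other encoding.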

The Union type solves most of the problems described above. Client implementations may still need a large refactor if they don't abstract over the possible forms of a DealProposal, but we avoid a messy period around the upgrade in which large swathes of PublishStorageDeals messages fail on-chain and large numbers of clients see SPs reject their deal proposals as invalid.

Backwards Compatibility

A migration is needed in the network upgrade that introduces this FIP. The migration runs over all DealProposals on chain, applying the following rule:

  • If Label is UTF-8, use the string type of the Union
  • Else, use the bytes type.

Note that this means that most deals will actually not have their encodings change at all, since most deals on Filecoin today do have UTF-8 encoded Labels.
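
The migration rule, and the observation that UTF-8 labels keep their encoding, can be sketched in Go as below. All names (`LabelKind`, `MigrateLabel`, `OldHead`, `NewHead`) are hypothetical. The sketch assumes the pre-upgrade String field was always written as a CBOR text string (major type 3) and only looks at the leading CBOR byte of labels shorter than 24 bytes, since the payload bytes that follow are the same before and after migration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// LabelKind names the union member a migrated label ends up in.
type LabelKind int

const (
	KindString LabelKind = iota
	KindBytes
)

// MigrateLabel applies the migration rule to a pre-upgrade label.
func MigrateLabel(old string) LabelKind {
	if utf8.ValidString(old) {
		return KindString
	}
	return KindBytes
}

// OldHead is the first CBOR byte of a pre-upgrade label. Assumption: the old
// String field was always written as a CBOR text string (major type 3).
// Sketch only: correct for labels shorter than 24 bytes.
func OldHead(old string) byte {
	return 3<<5 | byte(len(old))
}

// NewHead is the first CBOR byte of the same label after migration.
// Sketch only: correct for labels shorter than 24 bytes.
func NewHead(old string) byte {
	if MigrateLabel(old) == KindBytes {
		return 2<<5 | byte(len(old)) // major type 2: byte string
	}
	return 3<<5 | byte(len(old)) // major type 3: text string, unchanged
}

func main() {
	for _, old := range []string{"utf8-label", string([]byte{0xc3, 0x28})} {
		fmt.Printf("%q: old head %#x, new head %#x\n", old, OldHead(old), NewHead(old))
	}
}
```

Because the rule depends only on the label's own bytes, a migration written this way gives the same result on every node without any extra state.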

Test Cases

Test that:

  • New deals can be created with UTF-8 string labels, and can be encoded and decoded
  • New deals can be created with byte string labels, and can be encoded and decoded
  • New deals CANNOT be created with non-UTF-8 string labels

Additionally, migration testing should ensure that:

  • Existing deals with UTF-8 Labels are migrated to Unions with strings, and the CBOR encoding of such deals does NOT change
  • Existing deals with non-UTF-8 Labels are migrated to Unions with bytes, and the CBOR encoding of such deals does change

Security Considerations

Implementations that previously validated Label strings as UTF-8 should audit every other place where labels are parsed or used. The UTF-8 check may have been rejecting potentially dangerous payloads that could otherwise exploit code interacting with this data, and byte labels will no longer pass through that check.

Incentive Considerations

This should have a positive effect on reliable and useful storage, because implementations of the protocol will be more correct and easier to maintain.

Product Considerations

This should have a positive effect on the ecosystem as a whole, because implementations of the protocol will be more correct and easier to maintain.

Implementation

A prototype of this implementation on top of the v7 actors release can be seen here. The actual implementation will have to be on a new version (v8) of actors, but this is useful to test backwards compatibility.

Future considerations

In a subsequent network upgrade, we can consider a FIP that replaces this type with simple byte strings (the ideal design described in the Design Rationale section). With sufficient communication, we can announce ahead of time that support for String labels is being deprecated, allowing all users to switch to byte strings.

Copyright

Copyright and related rights waived via CC0.