fip: 0048
title: f4 Address Class
author: Steven Allen <[email protected]>, Melanie Riise <[email protected]>, Rául Kripalani <[email protected]>
status: Final
type: Technical
category: Core
created: 2022-10-02

FIP-0048: f4 Address Class

Simple Summary

This FIP adds an extensible address class, f4, to support the creation of new user-defined actor addressing schemes. Specifically, the actor at {some-actor-id} (e.g., 123) will manage all addresses starting with f4{some-actor-id}f (f4123f).

Additionally, this FIP adds the ability to send funds to such an address before deploying an actor there. As of today, it's possible to send funds to an account (f1 or f3) that doesn't yet exist on-chain, but there's no way to send funds to a non-account actor (e.g., a multisig actor with an f2 address) that doesn't yet exist on-chain.

This will allow users to:

  • Implement foreign addressing systems (e.g., the one used in Ethereum) in Filecoin.
  • Implement predictable addressing systems so that an actor's address may be computed before deployment.
  • Send funds to such a pre-computed actor address before actually deploying the actor there.

NOTE: While the f4 address class is designed and intended to be managed by user-defined actors, this FIP currently restricts f4 assignment to specific "blessed" address managers. This restriction will be relaxed in a future FIP once users are allowed to deploy native WebAssembly actors.

Abstract

Currently, adding new address types and address derivation methods to the Filecoin network requires extensive changes and a network upgrade. This FIP introduces a new user-extensible address class such that new addressing systems can be added by users without extensive changes and network upgrades.

It achieves this by adding a new address class (f4) where user-definable "address management" actors will be able to create new actors under an address-manager specific f4 prefix (specifically, f4{address-manager-actor-id}f). This will allow Filecoin to support the native addressing schemes of foreign runtimes like the EVM and to create new predictable addressing schemes that support interactions (both actual and counterfactual) with addresses that do not yet exist on-chain.

Specifically, this FIP:

  1. Creates a new extensible address class (f4) to support new addressing schemes.
  2. Allows funds to be sent to an f4 address before an actor has been deployed to the address.
  3. Extends the Init actor with a new Exec4 method for creating actors with specific f4 addresses.
  4. Extends ActorState objects to include the f4 address, if assigned.
  5. Adds a new lookup_delegated_address syscall to the FVM for looking up an actor's f4 address, if assigned.

Change Motivation

There are two primary motivations for this FIP:

  1. While Filecoin currently allows users to interact with and/or predict the addresses of account actors (f1 and f3) that do not yet exist on-chain, it doesn't currently allow users to interact with and/or predict the addresses of non-account actors (f2) before deployment. This FIP supersedes Predictable actor address generation for wallet contracts, state channels (aka CREATE2).
    • This functionality is important for custom wallet types where a user may want to generate a wallet's address before deploying the wallet (e.g., because the wallet may never actually receive funds).
    • This functionality is especially important for state-channel applications where a state-channel may need to be able to refer to actors that, in the "optimistic" case, will never actually be deployed on-chain.
  2. The FVM aims to be a hypervisor with first-class support for runtimes like the EVM. To achieve full compatibility, the FVM must support the addressing schemes used by these runtimes (e.g., Ethereum addressing), as they often rely on particular address formats and derivation logic.
    • EVM tooling expects Ethereum addresses.
    • Tools that interact with EVM actors counterfactually expect the addresses of said actors to be derived via the precise method specified in the CREATE2 opcode.
    • Tools like, e.g., Metamask will derive Ethereum addresses for wallets, not Filecoin addresses, and converting between the two is impossible due to the hash functions involved.

This FIP achieves this by:

  1. Providing a way to create new "predictable" addressing schemes that will allow users to pre-compute the address at which some actor could be deployed on-chain, without forcing them to actually deploy it (yet).
  2. More generally, allowing users to deploy arbitrary user-defined addressing schemes.

Specification

This FIP adds a new extensible address class (f4) where user-defined address management actors will be able to create new actors and assign them f4 addresses starting with the address manager's actor ID.

For now, this FIP proposes that address management actors be limited to built-in "singleton" actors, but we expect users to be able to deploy their own address management actors in the future.

There are four key changes proposed in this design:

  1. A new address class (f4). Tooling will need to be adapted to parse and handle said addresses.
  2. A new lookup_delegated_address syscall to retrieve an actor's f4 address and a new delegated_address field in the ActorState object to record an actor's f4 address.
  3. A new "placeholder" actor.
  4. A new Exec4 method on the init actor to support the f4 address class.

NOTE: For now, f4 addresses may only be assigned by and in association with specific "blessed" address managers. The first of these will likely be the "Ethereum Address Manager (EAM)". Once users are able to deploy custom WebAssembly actors, this restriction should be relaxed (in a future FIP).

Parameters

| Parameter            | Value | Rationale                     |
|----------------------|-------|-------------------------------|
| MAX_SUBADDRESS_BYTES | 54    | fits in 64 bytes when encoded |

The f4 Address Class

An address manager will own f4 addresses (in the binary representation) starting with the leb128-encoded actor ID and followed by an arbitrary (chosen by the address management actor, up to MAX_SUBADDRESS_BYTES bytes) sub-address: [0x4 (f4 address class)] || {leb128(actor-id)} || {sub-address}.
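The binary layout above can be sketched as follows. This is a minimal illustration, not the reference implementation: the names `leb128` and `f4_binary` are hypothetical, and the only values taken from this FIP are the protocol byte (4) and MAX_SUBADDRESS_BYTES.

```rust
/// Maximum sub-address length, from the Parameters table.
const MAX_SUBADDRESS_BYTES: usize = 54;
/// Protocol byte of the f4 address class.
const F4_PROTOCOL: u8 = 4;

/// Unsigned LEB128: 7 bits per byte, least-significant group first,
/// with the high bit set on every byte except the last.
fn leb128(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Builds [0x4] || leb128(actor-id) || sub-address.
/// Returns None if the sub-address exceeds MAX_SUBADDRESS_BYTES.
fn f4_binary(manager_id: u64, subaddress: &[u8]) -> Option<Vec<u8>> {
    if subaddress.len() > MAX_SUBADDRESS_BYTES {
        return None;
    }
    let mut out = vec![F4_PROTOCOL];
    out.extend(leb128(manager_id));
    out.extend_from_slice(subaddress);
    Some(out)
}
```

For an address manager with ID 10, `f4_binary(10, &[0xaa, 0xbb])` yields the bytes `[4, 10, 0xaa, 0xbb]`. IDs of 128 or more occupy more than one byte in the prefix.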

In text, this address will be formatted as f4{decimal(actor-id)}f{base32(sub-address || checksum)} where checksum is the blake2b-32 (32bit/4byte blake2b) hash of the address in its binary representation (protocol included). This is the same checksumming approach used in the textual representation of f1-f3 addresses.
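As a rough sketch of the textual format, the snippet below assembles f4{decimal(actor-id)}f{base32(sub-address || checksum)}. It assumes the unpadded, lower-case RFC 4648 base32 alphabet. The function names are hypothetical, and the 4-byte checksum is taken as an input because computing blake2b-32 needs a hashing library that is outside the scope of this illustration.

```rust
/// RFC 4648 base32 alphabet in lower case (assumed encoding for address strings).
const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";

/// Unpadded lower-case base32 encoding.
fn base32_lower(data: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0; // most recent input bits, newest in the low end
    let mut bits: u32 = 0; // how many low bits of `buffer` are still unencoded
    for &byte in data {
        buffer = (buffer << 8) | byte as u32;
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Pad the final group with zero bits on the right.
        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Formats an f4 address. `checksum` must be the blake2b-32 hash of the
/// binary address; it is supplied by the caller in this sketch.
fn f4_text(manager_id: u64, subaddress: &[u8], checksum: [u8; 4]) -> String {
    let mut payload = subaddress.to_vec();
    payload.extend_from_slice(&checksum);
    format!("f4{}f{}", manager_id, base32_lower(&payload))
}
```

Because the actor ID is written in decimal before the separator, a reader can identify the address manager without decoding the base32 payload.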

For example, an address management actor at f010 will be able to assign addresses starting with f410- in text or [4, 10, ...] in binary. [1]

NOTE: The textual format defined here is the universal textual format for f4 addresses. We expect that chain explorers and client implementations will understand specific well-known address types and will format these addresses according to their "native" representation. For example, tooling should transparently convert Ethereum addresses in the 0x... format to and from the equivalent f4 address. [2]

New lookup_delegated_address Syscall And State Changes

To support foreign address systems, the FVM must provide a way to look up an actor's f4 address. This FIP proposes two changes to support this functionality:

  1. A new field in the actor state-root object to store an actor's f4 address, if assigned.
  2. A new syscall to retrieve this information.

The ActorState object currently has the following fields (encoded as a CBOR list):

pub struct ActorState {
   pub code: Cid,
   pub state: Cid,
   pub nonce: u64,
   pub balance: TokenAmount,
}

This FIP adds a fifth field, delegated_address:

pub struct ActorState {
   pub code: Cid,
   pub state: Cid,
   pub nonce: u64,
   pub balance: TokenAmount,
   pub delegated_address: Option<Address>, // NEW!
}

If an actor doesn't have an f4 address, this field will be a CBOR null. Otherwise, it will contain the actor's f4 address.

This FIP also proposes a new lookup_delegated_address syscall to retrieve this information:

mod actor {
    /// Looks up the "delegated" (f4) address of the target actor (if any).
    ///
    /// # Arguments
    ///
    /// `addr_buf_off` and `addr_buf_len` specify the location and length of the output buffer in
    /// which to store the address.
    ///
    /// # Returns
    ///
    /// The length of the address written to the output buffer, or 0 if the target actor has no
    /// delegated (f4) address.
    ///
    /// # Errors
    ///
    /// | Error           | Reason                                                           |
    /// |-----------------|------------------------------------------------------------------|
    /// | NotFound        | if the target actor does not exist                               |
    /// | BufferTooSmall  | if the output buffer isn't large enough to fit the address       |
    /// | IllegalArgument | if the output buffer isn't valid, in memory, etc.                |
    pub fn lookup_delegated_address(
        actor_id: u64,
        addr_buf_off: *mut u8,
        addr_buf_len: u32,
    ) -> Result<u32>;
}
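A caller would typically wrap this syscall in a safe helper that allocates the output buffer and interprets a return value of 0 as "no f4 address". The sketch below is hypothetical: `delegated_address` and `mock_lookup` are illustrative names, the `syscall` parameter stands in for the real import, and the 64-byte buffer size is an assumption drawn from the "fits in 64 bytes when encoded" rationale in the Parameters table.

```rust
/// Errors listed in the syscall's error table.
#[derive(Debug, PartialEq)]
enum SyscallError {
    NotFound,
    BufferTooSmall,
    IllegalArgument,
}

/// Assumed upper bound on an encoded address.
const MAX_ADDRESS_LEN: usize = 64;

/// Hypothetical safe wrapper. `syscall` has the same shape as
/// `actor::lookup_delegated_address`: it fills the buffer and returns the
/// number of bytes written, or 0 if the actor has no f4 address.
fn delegated_address<F>(actor_id: u64, syscall: F) -> Result<Option<Vec<u8>>, SyscallError>
where
    F: Fn(u64, *mut u8, u32) -> Result<u32, SyscallError>,
{
    let mut buf = [0u8; MAX_ADDRESS_LEN];
    let len = syscall(actor_id, buf.as_mut_ptr(), buf.len() as u32)? as usize;
    if len == 0 {
        return Ok(None);
    }
    Ok(Some(buf[..len].to_vec()))
}

/// Mock syscall for illustration: actor 100 has an f4 address, actor 101
/// exists without one, and every other ID is unknown.
fn mock_lookup(actor_id: u64, buf: *mut u8, buf_len: u32) -> Result<u32, SyscallError> {
    let addr: &[u8] = match actor_id {
        100 => &[4, 10, 0xaa, 0xbb],
        101 => &[],
        _ => return Err(SyscallError::NotFound),
    };
    if buf.is_null() {
        return Err(SyscallError::IllegalArgument);
    }
    if addr.len() > buf_len as usize {
        return Err(SyscallError::BufferTooSmall);
    }
    // SAFETY: the caller guarantees `buf` points to `buf_len` writable bytes.
    unsafe { std::ptr::copy_nonoverlapping(addr.as_ptr(), buf, addr.len()) };
    Ok(addr.len() as u32)
}
```

Returning `Ok(None)` for a zero length keeps "actor exists but has no f4 address" distinct from the `NotFound` error.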

Placeholder Actors

To support interactions with addresses that do not yet exist on-chain, the FVM must support sending funds to an address before an actor is deployed there. It's already possible to automatically create an account (f1 or f3) on send, but it's not possible to do the same for arbitrary actors.

First, we'll define a special "placeholder" actor that does nothing. The placeholder's implementation is simply:

#[no_mangle]
pub extern "C" fn invoke(_: u32) -> u32 {
    0
}

On send to an unassigned f4 address f4{actor-id}f{subaddress}, the FVM will:

  1. Validate that an actor with ID actor-id exists. For now, the actor-id will be further restricted to "blessed" address managers.
  2. Create a new "placeholder actor" to hold the received funds, assigning the f4 address to that new actor and recording the new actor's f4 address in the actor's ActorState object.

Sending to an unassigned f2 address will not be supported, as f2 addresses are designed to protect against reorgs and aren't designed for this use case (see F2 Versus F4).

Specifically, the FVM will:

  1. Create a new "placeholder" actor with the appropriate code CID, an empty state, zero nonce, and the target f4 address.
  2. Invoke the requested method on the placeholder (which always does nothing), depositing any value transferred.
  3. Assign the target f4 address to that actor (but don't yet assign an f2 address) by updating the Init actor's address map.

The placeholder actor will have the following state object serialized as a DagCBOR tuple:

ActorState {
    code: PLACEHOLDER_CODE_CID,
    state: EMPTY_ARRAY_CID,
    nonce: 0,
    balance: TokenAmount::zero(),
    delegated_address: f4_address
}
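The send path described in this section can be modelled with a toy in-memory state tree. Everything below is a simplified sketch under stated assumptions: an f4 address is represented as an (address manager ID, sub-address) pair, `StateTree`, `genesis`, and `send_to_f4` are hypothetical names, and method invocation on the placeholder is reduced to a balance update.

```rust
use std::collections::HashMap;

/// (address manager ID, sub-address): a simplified stand-in for an f4 address.
type F4 = (u64, Vec<u8>);

#[derive(Debug, Clone, PartialEq)]
enum Code {
    Placeholder,
    Other,
}

#[derive(Debug, Clone, PartialEq)]
struct ActorState {
    code: Code,
    nonce: u64,
    balance: u128,
    delegated_address: Option<F4>,
}

struct StateTree {
    actors: HashMap<u64, ActorState>, // actor ID -> state
    address_map: HashMap<F4, u64>,    // stand-in for the Init actor's address map
    blessed: Vec<u64>,                // IDs of "blessed" address managers
    next_id: u64,
}

impl StateTree {
    /// Toy genesis with a single blessed address manager at ID 10.
    fn genesis() -> Self {
        let manager = ActorState { code: Code::Other, nonce: 0, balance: 0, delegated_address: None };
        StateTree {
            actors: HashMap::from([(10, manager)]),
            address_map: HashMap::new(),
            blessed: vec![10],
            next_id: 100,
        }
    }

    /// Deposits `value` at `to`, creating a placeholder if the address is unassigned.
    fn send_to_f4(&mut self, to: F4, value: u128) -> Result<u64, &'static str> {
        let id = match self.address_map.get(&to).copied() {
            Some(id) => id,
            None => {
                if !self.actors.contains_key(&to.0) {
                    return Err("address manager does not exist");
                }
                if !self.blessed.contains(&to.0) {
                    return Err("address manager is not blessed");
                }
                let id = self.next_id;
                self.next_id += 1;
                let placeholder = ActorState {
                    code: Code::Placeholder,
                    nonce: 0,
                    balance: 0,
                    delegated_address: Some(to.clone()),
                };
                self.actors.insert(id, placeholder);
                self.address_map.insert(to, id);
                id
            }
        };
        self.actors.get_mut(&id).ok_or("address maps to a missing actor")?.balance += value;
        Ok(id)
    }
}
```

A second send to the same address resolves through the address map and tops up the existing placeholder rather than creating another actor.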

Exec4

To support the f4 address class, this FIP adds an Exec4 method to the Init actor to create new actors with an f4 address (in addition to an f2 address) under the caller's f4 sub-namespace. The f4-address-to-actor-ID mappings are stored alongside the f1-f3-address-to-actor-ID mappings in the Init actor's address_map, requiring no additional state or state migrations.

Specifically, given the parameters:

struct Exec4Params {
    subaddress: Bytes, // limited to MAX_SUBADDRESS_BYTES
    code: Cid,
    constructor_params: Bytes,
}

Exec4 will:

  1. Compute the f4 address as 4{leb128(caller-actor-id)}{subaddress} where {caller-actor-id} is the actor ID of the actor invoking the init actor and {subaddress} is the sub-address specified in the Exec4 parameters.
  2. If an actor exists at that address and it is not a placeholder, abort.
  3. If an actor does not exist at that address, create a new actor (with a new actor ID) and assign the f4 address to that actor.
  4. Assign an f2 (stable) address to the actor (the f4 address may not be reorg stable).
  5. Set the actor's code CID to code and invoke the actor's constructor.
  6. Finally, Exec4 will return the actor's ID and f2 (stable) address. It will not return the f4 address as the caller should be able to compute that locally.

NOTE: For now, Exec4 will only be callable by "blessed" address managers.
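The Exec4 steps can be sketched against a toy model of the Init actor. This is an illustration only: `Init`, `Actor`, and the string used for the f2 address are hypothetical stand-ins, the code CID is represented as a plain string, and both the "blessed" caller check and the constructor invocation are omitted.

```rust
use std::collections::HashMap;

const MAX_SUBADDRESS_BYTES: usize = 54;

#[derive(Debug, Clone, PartialEq)]
enum Code {
    Placeholder,
    User(String), // stand-in for a code CID
}

struct Actor {
    code: Code,
    balance: u128,
}

struct Init {
    actors: HashMap<u64, Actor>,
    f4_map: HashMap<(u64, Vec<u8>), u64>, // (manager ID, sub-address) -> actor ID
    f2_map: HashMap<String, u64>,         // stand-in f2 address -> actor ID
    next_id: u64,
}

impl Init {
    fn new() -> Self {
        Init { actors: HashMap::new(), f4_map: HashMap::new(), f2_map: HashMap::new(), next_id: 100 }
    }

    /// Returns the actor ID and a stand-in f2 address.
    fn exec4(&mut self, caller_id: u64, subaddress: Vec<u8>, code: &str) -> Result<(u64, String), &'static str> {
        if subaddress.len() > MAX_SUBADDRESS_BYTES {
            return Err("sub-address too long");
        }
        // Step 1: the f4 address lives under the caller's namespace.
        let f4 = (caller_id, subaddress);
        // Steps 2 and 3: abort on a non-placeholder, reuse a placeholder, else create.
        let id = match self.f4_map.get(&f4).copied() {
            Some(id) if self.actors[&id].code != Code::Placeholder => {
                return Err("address already in use");
            }
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.actors.insert(id, Actor { code: Code::Placeholder, balance: 0 });
                self.f4_map.insert(f4, id);
                id
            }
        };
        // Step 4: assign an f2 address (real derivation not modelled here).
        let f2 = format!("f2-of-{}", id);
        self.f2_map.insert(f2.clone(), id);
        // Step 5: set the code; the constructor call is omitted in this sketch.
        self.actors.get_mut(&id).unwrap().code = Code::User(code.to_string());
        // Step 6: return the ID and f2 address, but not the f4 address.
        Ok((id, f2))
    }
}
```

Deploying over a placeholder keeps the placeholder's actor ID and balance, which is what makes funds sent before deployment available to the deployed actor.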

Design Rationale

This FIP meets the design goals by:

  • Supporting interactions with addresses that do not yet exist on-chain by allowing user-defined address management actors to assign addresses based on arbitrary properties.
  • Supporting arbitrary addressing schemes (e.g., Ethereum addressing) by carving out address "namespaces".

In this section, we discuss some of the tradeoffs and alternatives considered.

Init Actor State

As this FIP stores f4 addresses alongside f2 addresses in the same map, actors with f4 addresses will have two entries in this map. We could, alternatively, put these addresses in a separate map but the current FIP lets us avoid changing the Init actor's state layout (for now).

Hashing v. Prefixing

An alternative to prefixing (f4{actor-id}f{subaddress}) would be to define a namespace via a hash function. I.e., an actor would own f4{hash(actor-id || subaddress)}.

The primary benefit of this approach is that addresses have a predictable size. The primary downsides are:

  • This is slightly more expensive to compute and EVM actors will have to convert Ethereum addresses into f4 addresses on-chain.
  • It's not reversible. This makes tooling and debugging significantly more painful.
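To illustrate the reversibility that prefixing preserves, the sketch below splits a binary f4 address back into the address manager's ID and the sub-address. The function names are hypothetical, and the decoder does not reject over-long or non-canonical LEB128 encodings.

```rust
/// Decodes an unsigned LEB128 value, returning it with the number of bytes consumed.
fn read_leb128(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in bytes.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None; // too many continuation bytes for a u64
        }
        value |= ((byte & 0x7f) as u64) << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // input ended before the final byte
}

/// Splits [0x4] || leb128(actor-id) || sub-address into (actor-id, sub-address).
fn split_f4(binary: &[u8]) -> Option<(u64, &[u8])> {
    let (&protocol, rest) = binary.split_first()?;
    if protocol != 4 {
        return None;
    }
    let (manager_id, used) = read_leb128(rest)?;
    Some((manager_id, &rest[used..]))
}
```

With a hashed namespace, no equivalent of `split_f4` could exist: the manager ID and sub-address would not be recoverable from the address alone.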

Control Inversion

An alternative to the factory approach would be to invert the flow and:

  1. Have the "creating" actor call Exec4 on the init actor with the address of the address manager.
  2. Have the init actor call some GenerateAddress method on the address manager.

However, the current approach gives the address manager more control as it effectively gets to act as a user-defined "init" actor, sitting between the user and the real init actor.

Aside explaining why we even considered this approach...

Currently (and unchanged by this proposal), new actors are constructed by the init actor as follows:

  1. The init actor creates a new actor object, assigns it a new ID, and hooks it into the state-tree.
  2. The init actor then calls method 1 on this new actor to "construct" it.

This means that an actor's constructor will always see the init actor as the "sender" and has no real way to know the "ultimate" actor that created it.

However, there have been discussions about replacing method 1 with a new top-level WebAssembly function, replacing step 2 above with a special privileged syscall to construct an actor. This would:

  1. Make the "creating" actor (not the init actor) appear as the sender inside the constructor.
  2. Ensure that actors can't be confused as to whether they're being invoked (via a message) or constructed.

Unfortunately, the actor factory approach proposed in this FIP means that the address manager would appear as the "sender", not the ultimate "creating" actor. On the other hand, we were unable to find any strong arguments for this being an issue in practice.

Bind4

The original proposal included a Bind4 method on the Init actor to bind an f4 address to an existing actor (allowing actors to have more than one f4 address). This would have allowed one to, for example:

  1. Receive funds at some f1 address (A) (e.g., from some existing Filecoin user).
  2. Receive funds at the equivalent Ethereum f4 address (B) (e.g., from some existing Metamask user).
  3. Later deploy a Filecoin account to A, then Bind4 B to said account, merging them into a single account.

Unfortunately, by step 3, both accounts would have been assigned separate f0 addresses. Merging these accounts would have required either removing or re-assigning one of the f0 addresses which would have violated existing assumptions on how ID addresses behave.

Textual Format

The universal textual format of f4 addresses, f4{decimal(actor-id)}f{base32(sub-address || checksum)}, is a trade-off between verbosity, readability, and determinism.

  • We split the actor-assigned "sub-address" with an f to make it possible to "read" the address management actor's ID without having to decode the address.
    • We could have encoded this address as f4{base32(leb-encoded-actor-id || sub-address || checksum)}, but that would have been impossible for a human to read.
    • We considered splitting the address with a dash (f4123-abcde), but @jbenet pointed out that dashes in identifiers have poor UX: they don't select as one word on double-click.
  • We chose to encode the actor-id as a decimal to mirror f0.
  • We chose to base32 encode the sub-address because it may be long. The main downside is that converting to/from an Ethereum address may require re-encoding from hex to base32 (but this point is somewhat moot given the checksum anyways).
  • We chose to require a specific checksum format and to not support multiple bases (e.g., multibase) in this address format to ensure that two f4 addresses are equivalent if and only if they have the same textual representation. This unfortunately means that converting to and from external addressing schemes like Ethereum/EVM addresses cannot be done by hand.

F2 Versus F4

This FIP makes a few interesting decisions with respect to f2 addresses:

  1. While it allows sending to unassigned f4 addresses, it does not allow the same for f2 addresses.
  2. It assigns both an f2 and an f4 address to new actors.
  3. It does not assign an f2 address until the actor actually exists on-chain.

The key distinction is that f2 addresses are designed to be stable and that f4 addresses are designed to be "user-programmable".

  • An f2 address allows a user to create a chain of messages where a later message refers to an actor created in an earlier message. An f2 address refers to the actor created by a specific message.
  • An f4 address allows an actor (an address manager) to "control" an address-space. This allows the address manager to implement foreign addressing schemes (e.g., Ethereum's) and allows users to refer to addresses that could contain an actor with a set of properties enforced by an address manager.

Circling back to the design decisions:

  1. A send to an unassigned f2 address must fail because that means a prior message in a message chain failed for some reason and didn't create the expected actor. If the send were to succeed and create a placeholder (as happens with an unassigned f4 address) the funds would be lost forever.
  2. Assigning an f2 address along with an f4 address allows a message chain to safely refer to an actor deployed in a prior message even if a prior message in the chain fails because, e.g., someone else successfully deployed an actor at the target f4 address.
  3. We delay assigning an f2 address until an actor is deployed for the same reason.
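The resulting behavior for a send to an address with no actor behind it can be summarized as a small decision function. This is a simplified sketch with hypothetical names; for f4 it models only the "address manager exists" check and ignores the temporary "blessed" restriction.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    CreateAccount,     // existing behavior for f1 and f3
    CreatePlaceholder, // new behavior for f4
    Fail,
}

/// Address class of an unassigned target address.
enum Unassigned {
    F1,
    F2,
    F3,
    F4 { manager_exists: bool },
}

fn send_to_unassigned(target: Unassigned) -> Outcome {
    match target {
        Unassigned::F1 | Unassigned::F3 => Outcome::CreateAccount,
        // An unassigned f2 address means an earlier message in the chain failed.
        Unassigned::F2 => Outcome::Fail,
        Unassigned::F4 { manager_exists: true } => Outcome::CreatePlaceholder,
        Unassigned::F4 { manager_exists: false } => Outcome::Fail,
    }
}
```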

Accepting/Rejecting Non-Zero Messages in the Placeholder Actor

The placeholder actor accepts all messages on all method numbers. This matches the behavior of, e.g., Ethereum accounts.

We considered rejecting everything but simple value transfers (method 0) due to some concerns around re-orgs. For example:

  1. User U1 sends a message to create some actor with an f4 address A.
  2. Some user U2 transfers value to A with a Deposit(some_address) method, attempting to deposit funds into an account identified by some_address.
  3. A chain re-org happens and the message from U2 gets executed before the message from U1.

Rejecting all non-zero methods would prevent this issue.

However, the primary purpose of a placeholder is to be a place to store value before an actor is actually deployed. Rejecting method invocations would, e.g., prevent placeholders from receiving NFTs via the FRC-0053 universal receiver hook, receiving value from EVM calls, etc.

Recording F4 addresses in the ActorState root

Instead of recording f4 addresses in each actor's state-root, one could record this address in the actor's state itself (i.e., implement this in "userland"). However, this approach has some significant downsides:

First, the actor needs to somehow learn about its address, which means the address would likely need to be passed into the actor in the constructor parameters. If the address is derived from the constructor parameters, the actor may need to be constructed in multiple steps.

Second, learning an actor's f4 address would involve:

  1. Calling some well-known method to look up the f4 address.
  2. Resolving that address back into an actor ID to make sure it actually belongs to the actor in question.

Recording other addresses in the ActorState root

A previous proposal suggested storing the f1 and f3 addresses in the actor state root object as well. Effectively, the delegated_address field would simply be an address field. However:

  1. In many ways, the f4 address class supersedes the f1 and f3 address classes. By only recording f4 addresses in this field, we leave room to eventually deprecate f1 and f3 addresses (assigning f4 addresses to all actors, even existing accounts). This also aligns well with the push towards account abstraction, because user-programmed accounts would likely have f4 addresses (and couldn't have f1 or f3 addresses).
  2. There's no easy-to-explain reason to group f1, f3, and f4 addresses into a common "class" while leaving out f2. The real answer is that these (f1, f3, and f4) are useful to know on-chain, while f2 addresses generally are not, but that doesn't make the situation any less confusing.

Backwards Compatibility

While this FIP aims to minimize backwards incompatible changes, it will require a state migration and will introduce new features and behavior changes:

  • A new delegated_address field will be added to all actor state-root objects. For all existing actors, this will be a single byte per actor to represent a CBOR null. To support this, the state-tree version will be bumped to version 5.
  • The address map in the init actor may now map multiple addresses to the same actor. Clients (especially dashboards) that aggregate/parse chain state will need to handle this case.
  • The code CID of a deployed actor may now change from the "placeholder" code CID to an actual code CID. Implementations that cache code CIDs will need to handle this case.
  • This FIP introduces a new address class (f4). Implementations of the Filecoin addressing standard will need to handle parsing of this new address class.
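The per-actor part of the state migration described in the first item of this list is small. As a rough sketch, with `Cid`, `TokenAmount`, and `Address` replaced by simplified stand-in types and `OldActorState` and `migrate` as hypothetical names:

```rust
// Simplified stand-ins for the real types.
type Cid = String;
type TokenAmount = u128;
type Address = Vec<u8>;

/// Actor state-root object before this FIP (four fields).
struct OldActorState {
    code: Cid,
    state: Cid,
    nonce: u64,
    balance: TokenAmount,
}

/// Actor state-root object after this FIP (five fields).
#[derive(Debug, PartialEq)]
struct ActorState {
    code: Cid,
    state: Cid,
    nonce: u64,
    balance: TokenAmount,
    delegated_address: Option<Address>,
}

/// Existing actors have no f4 address, so the new field is always None,
/// which is the value that serializes as a CBOR null.
fn migrate(old: OldActorState) -> ActorState {
    ActorState {
        code: old.code,
        state: old.state,
        nonce: old.nonce,
        balance: old.balance,
        delegated_address: None,
    }
}
```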

Test Cases

Address Format

TODO: ...

Behavior

TODO: Test vectors will be extracted from the reference implementation once completed.

Security Considerations

Address Stability

Unlike f1-f3 addresses, f4 addresses may not be stable across reorgs (depending on the address management actor). This could be used to confuse users who may be conditioned to assume "short addresses are unstable, long addresses are 'secure'".

The only real solution here is to educate users that "f4 means user defined".

Incentive Considerations

Actor Churn

By relying on ID addresses to distinguish between different f4 address sub-namespaces, this FIP may encourage ID address "churn" by those seeking "vanity" addresses.

However, we have mitigated one aspect of this: f4 addresses may not be used to reward ID address churn by deploying to an f4 address managed by a non-existent address management actor.

Product Considerations

There are two main product goals in this FIP:

  1. Support Ethereum tooling and the EVM. This will allow users to deploy user-defined actors on the Filecoin network without having to create an entirely new ecosystem, tooling, etc.
  2. Support interactions with addresses that do not yet exist on-chain (required for many counterfactual interactions).

Ethereum Support

In a future FIP, we will propose an Ethereum Address Manager (EAM) actor, likely at the address f032. This actor will be responsible for assigning all "Ethereum-style" addresses on the Filecoin network (EVM actor addresses and Ethereum account addresses).

Ethereum Addresses

With this FIP, the full Ethereum address would be preserved inside the f432f address sub-namespace making it easy to:

  1. Recognize Ethereum addresses (likely displaying them in their usual 0x... format).
  2. Convert back and forth between the Ethereum address format (0x...) and the Filecoin (f4...) format.
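Because the 20-byte Ethereum address is stored unchanged as the sub-address, the conversion is a plain hex decode and encode. The helpers below are a hypothetical sketch: they handle only the sub-address bytes, leave out the f4 prefix and checksum, and emit lower-case hex, so any mixed-case (EIP-55) capitalization of the input is not preserved.

```rust
/// Parses a 0x-prefixed, 40-hex-digit Ethereum address into 20 sub-address bytes.
fn eth_to_subaddress(eth: &str) -> Option<[u8; 20]> {
    let hex = eth.strip_prefix("0x")?;
    if hex.len() != 40 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 20];
    for (i, byte) in out.iter_mut().enumerate() {
        // Safe to slice by byte index: the string is known to be ASCII.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Formats 20 sub-address bytes as a lower-case 0x... string.
fn subaddress_to_eth(subaddress: &[u8; 20]) -> String {
    let mut out = String::from("0x");
    for byte in subaddress {
        out.push_str(&format!("{:02x}", byte));
    }
    out
}
```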

CREATE

To support CREATE1, the EAM will export a Create method taking:

struct CreateParams {
    initcode: Bytes,
}

It will then create an address per the Ethereum yellowpaper and call Init.Exec4 with:

Exec4Params {
    subaddress: b"<computed address>",
    code: EVM_ACTOR_CID,
    constructor_params: initcode,
}

CREATE2

To support CREATE2, the EAM will export a Create2 method taking:

struct Create2Params {
    initcode: Bytes,
    salt: [u8; 32],
}

It will then create an address per EIP-1014 and call Init.Exec4 with:

Exec4Params {
    subaddress: b"<computed address>",
    code: EVM_ACTOR_CID,
    constructor_params: initcode,
}

Abstract Accounts

This highly depends on how code will be deployed to an abstract account. It will likely involve a CreateAccount method on the ETH Address Manager, and will also likely involve the ability to automatically deploy an abstract account to a placeholder actor in some cases.

Interactions With Unassigned Addresses

f4 addresses support interactions with addresses that do not yet exist on-chain by delegating address assignment (within some f4 prefix) to a (potentially user-defined) address management actor. These interactions may be actual (a user may send to an address that has not yet been defined) or counterfactual (a user may specify such an address in a payment channel voucher, but never actually create an actor at said address).

For example, let's say we needed to be able to refer to an actor that could exist with some specific code CID and some specific constructor parameters (effectively, CREATE2).

To achieve this, we'd deploy an address manager actor with a single Exec(code_cid, constructor_params) method (i.e., identical to Init.Exec). Internally, this function would call Init.Exec4(subaddress=hash(code_cid || constructor_params), code=code_cid, constructor_params=constructor_params). In practice, this actor will likely already exist on-chain from day one.

With this address manager actor, we can refer to an actor that might exist, constructed with some code and params, as f4{decimal(address-management-actor)}f{base32(hash(code || params) || checksum)}.

Implementation

TODO: IN PROGRESS

Copyright

Copyright and related rights waived via CC0.

Footnotes

  1. Where the address manager's ID address is f01111 and the sub-address is 0xeff924032365F51a36541efA24217bFc5B85bc6B, the resulting textual format would be f41111f574siazdmx2runsud35ciil37rnylpdl

  2. For example, given a hypothetical address manager (f01112, not proposed here) that manages a namespace of raw ASCII addresses (hello world, for example), the standard format would be f41112fnbswy3dpeb3w64tmmqqq, though clients might recognize the address manager and display it as text {hello world}