Skip to content

Latest commit

 

History

History
604 lines (470 loc) · 23.8 KB

fip-0049.md

File metadata and controls

604 lines (470 loc) · 23.8 KB
fip title author discussions-to status type category created spec-sections requires replaces
0049
Actor events
Raúl Kripalani (@raulk), Steven Allen (@stebalien)
Final
Technical Core
Core
2022-10-12
N/A
N/A

FIP-0049: Actor events

Table of Contents generated with DocToc

Simple Summary

This FIP adds the ability for actors to emit externally observable events during execution. Events are fire-and-forget, and can be thought of as a side effect from execution. The payloads of events are self-describing objects signaling that some relevant circumstance, action, or transition has ocurred. Actor events enable external agents to observe the activity on chain, and may eventually become the source of internal message triggers. Events cannot be queried or accessed by actor logic (hence the usage of the term "external").

Abstract

This FIP introduces a minimal design for actor events, including their schema, a new syscall, a new field in the MessageReceipt structure, and mechanics to commit these execution side effects on the chain. By minimal we mean that the protocol stays largely unopinionated and unprescriptive around higher level concerns like event indexing, traversal, and proofs (e.g. inclusion proofs for light clients), and other future higher-level features.

Change Motivation

Two main motivations warrant the introduction of actor events at this time.

  1. They have been long missing from the protocol, complicating observability and forcing Filecoin network monitoring or accounting tools resort to creative approaches to obtain the data they needed, e.g. forking and instrumenting built-in actors code, or replaying messages and performing state tree diffs.

    While this may have been tolerable while the system had a limited number of actors, such approaches will no longer scale in a world of discretionary programs on chain.

    A future FIP could enhance built-in actors to emit events when power changes, sectors are onboarded, deals are made, sectors are terminated, penalties are incurred, etc.

  2. The upcoming introduction of the Filecoin Ethereum Virtual Machine (FEVM) runtime necessitates an event mechanism at the protocol layer to handle the LOG{0..4} EVM opcodes, and the subsequent serving of such data through external JSON-RPC methods exposed by the client.

Specification

New chain types

We introduce the following data model to represent and commit events on chain. These objects are DAG-CBOR encoded with tuple encoding.

/// Represents an event emitted throughout message execution, stamped by the FVM
/// with additional metadata, and committed on chain.
struct StampedEvent {
    /// Carries the actor ID of the emitting actor (stamped by the FVM).
    emitter: u64,
    /// Payload of the event, as emitted by the actor.
    event: ActorEvent,
}

/// An event as originally emitted by the actor.
type ActorEvent = [Entry];

struct Entry {
    /// A bitmap conveying metadata or hints about this entry.
    flags: u64,
    /// The key of this entry.
    key: String,
    /// The value's IPLD codec (0 is defined to mean "no value/codec".
    codec: u64,
    /// The value of this entry as a byte string.
    value: Bytes,
}

Its main constituent is a list of ordered key-value entries. This approach is inspired by structured logging concepts. Keys are UTF-8 strings, while values can be any IPLD block.

However, this FIP restricts the acceptable codecs to IPLD_RAW (raw bytes) with the IPLD codec 0x55. A future FIP will loosen this restriction.

Every entry contains a flags bitmap with underlying type u64, conveying metadata or hints about the entry. These are the currently supported flags:

hex binary meaning
0x01 b00000001 indexed by key (hint)
0x02 b00000010 indexed by value (hint)

Flags are additive, such that value 0x03 (b00000011) demands that the entry be indexed by both key and value.

A zero-valued bitflag indicates that no flags are in effect.

The two concrete flags introduced herein serve merely as client hints to signal indexing intent with different scopes. Clients should honor the request, and construct indices that facilitate corresponding fast lookups.

Event examples

The following examples demonstrate how to use these structures in practice. Payloads are DAG-CBOR encoded types.

Ethereum-style event

See FIP-0054 for more information.

Event {
    emitter: 1234,
    entries: [
        (0x03, "t1", 0x55, <32-byte array>), // 0x55 = IPLD_RAW
        (0x03, "t2", 0x55, <32-byte array>),
        (0x03, "t3", 0x55, <32-byte array>),
        (0x03, "t4", 0x55, <32-byte array>),
        (0x03, "p", 0x55, <32-byte array>),
    ],
}

Fungible token transfer

NOTE: value types other than byte string are not supported yet.

// type, sender, and receiver indexed by key and value.
Event {
    emitter: 1260,
    entries: [
        (0x03, "type", 0x51, cbor_encode("transfer")), // 0x51 = CBOR
        (0x03, "sender", 0x51, cbor_encode(<actor id>)),
        (0x03, "receiver", 0x51, cbor_encode(<actor id>)),
        // BE-encoded BigInt, Not indexed.
        (0x00, "amount", 0x51, cbor_encode(<token amount in atto precision>)),
    ],
}

Non-fungible token transfer

NOTE: value types other than byte string are not supported yet.

// All entries indexed, carrying the event type as a key (transfer).
Event {
    emitter: 101,
    entries: [
        (0x01, "transfer", 0x0, []), // no codec, no value.
        (0x03, "sender", 0x51, cbor_encode(<actor id>)),
        (0x03, "receiver", 0x51, cbor_encode(<actor id>)),
        (0x03, "token_id", 0x51, cbor_encode(<internal identifier>)),
    ],
}

Access control

NOTE: value types not supported yet.

// Compact representation of an "access denied" event, indexing the object id but not its key.
Event {
    emitter: 480,
    entries: [
        (0x01, "denied", 0x0, []), // no codec, no value.
        (0x02, "id", 0x51, cbor_encode(<object id>)),
    ],
}

Multisig approval

NOTE: value types not supported yet.

// Only type and signers indexed; signer is a variadic entry.
Event {
    emitter: 1678,
    entries: [
        (0x03, "type", "approval"),
        (0x03, "signer", 0x51, cbor_encode(<actor id>)),
        (0x03, "signer", 0x51, cbor_encode(<actor id>)),
        (0x03, "signer", 0x51, cbor_encode(<actor id>)),
        // Not indexed.
        (0x00, "callee", 0x51, cbor_encode(<actor id>)),
        // BE-encoded BigInt, Not indexed.
        (0x00, "amount", 0x51, cbor_encode(<token amount in atto precision>)),
        // Not indexed.
        (0x00, "msg_cid", 0x51, cbor_encode(<message cid>)),
    ],
}

New vm::emit_event syscall

We define a new syscall under the vm namespace so that actors can emit events.

/// Emits an actor event. It takes a CBOR encoded ActorEvent that has been
/// written to Wasm memory, passed by memory offset and length.
/// 
/// The FVM validates the structural, syntatic, and semantic correctness of the
/// supplied event, and errors with `IllegalArgument` if the payload was invalid.
///
/// Calling this syscall may immediately halt execution with an out of gas error,
/// if such condition arises.
///
/// # Errors
///
/// | Error               | Reason                                                            |
/// |---------------------|-------------------------------------------------------------------|
/// | [`IllegalArgument`] | Event failed to validate due to improper encoding or invalid data |
/// | [`ReadOnly`]        | Call rejected because the kernel was in read-only mode            |
fn emit_event(event_off: u32, event_len: u32) -> Result<()>;

This syscall is rejected with a ReadOnly error number when the kernel is in read-only mode.

This syscall immediately validates the input as per below. Validation failures result in error number IllegalArgument.

If validation passes, the syscall creates a StampedEvent object, sets the ActorID of the emitter in the emitter field, and stores it in the internal events accumulator.

Event validation rules

  1. Validate the structural correctness of the input, optionally deserializing into an in-memory representation of the ActorEvent.
  2. Validate that each entry's value codec is acceptable. Currently, only IPLD_RAW (0x55) is allowed).
  3. Validate that each entry's key is at most 32 bytes long and is valid utf-8.
  4. Validate that the event has at most 256 entries.
  5. Validate that the total length of all entry data combined is at most 8KiB.

Event accumulation and aggregation

During message call stack execution, StampedEvents are accumulated within the FVM.

  • When an invocation container for an actor exits normally (exit code 0), events emitted from that invocation and its transitive callees are retained.
  • When an actor exits abnormally (exit code > 0), events emitted from that invocation and its transitive callee are discarded.
  • Out of gas or fatal errors cause all events emitted by the call stack to be discarded.

At the end of successful message execution, the FVM serializes all retained StampedEvents to DAG-CBOR and aggregates them into an AMT with bitwidth 5, in their natural position. The AMT's root CID is then returned to the client, along with the events themselves, together with the other results from message execution.

Chain commitment

The existing MessageReceipt chain data structure is augmented with a new events_root field:

struct MessageReceipt {
    exit_code: ExitCode,
    return_data: RawBytes,
    gas_used: i64,
    // Root of AMT<StampedEvent, bitwidth=5>
    // Absent when no events have been emitted.
    // Serializes as a CBOR null value when absent, and as a CBOR IPLD Link when present.
    events_root: Option<Cid>,
}

Because the network agrees on the content of receipts through the ParentMessageReceipts field on the BlockHeader, actor events are implicitly committed to the chain via this new events_root field.

Gas costs

In addition to the syscall gas cost, emitting an event carries two fees:

  1. A validation fee, applied on entering the syscall.
  2. A processing fee, applied on successful validation and event acceptance.

Validation fee

// The length of the ActorEvent data structure as supplied by the actor.
v_input_len := len(actor_event)

// Gas parameters for validation.
// Syscall gas enough to cover the flat fee.
p_validation_cost_flat := 1750
p_validation_cost_per_byte := 25

// Validation fee. Covers the cost of validating the incoming `ActorEvent` object.
validation_fee := (p_memcpy_gas_per_byte * v_input_len)
                  + p_validation_cost_flat
                  + (p_validation_cost_per_byte * v_input_len)

Processing fee

// The estimated size of the StampedEvent. It is the length of the supplied ActorEvent
// + 9 bytes for the actor ID (worst case scenario) + some extra bytes for CBOR framing.
v_stamped_event_estsize := v_input_len + 12
v_total_indexed_elements := <number of elements being indexed, where keys and values count separately>
v_sum_indexed_bytes := <sum of indexed bytes across all elements>

// Cost per element indexed, assuming one indexing entry per element.
// Zero for now because indexing is not consensus-critical and
// not in the hot path of block validation or production.
p_index_cost_per_element := 0
// Cost per byte indexed, equated to the storage cost per byte.
p_index_cost_per_byte := p_storage_per_byte

// Processing fee
//
// Covers:
// - Cost of stamping the event and reserializing to a `StampedEvent`.
// - Cost of allocating the `StampedEvent`.
// - Anticipated cost of indexing the indexed entries.
// - Anticipated cost of writing into the AMT (allocation).
// - Anticipated cost of hashing AMT entries.
// - Anticipated cost of returning the event to the client.
processing_fee := 3 * (p_memcpy_gas_per_byte * i_stamped_event_estsize)
                  + 2 * (p_block_alloc_cost.apply(i_stamped_event_estsize))
                  + (p_hash_cost[Blake2b-256b1].apply(i_stamped_event_estsize))
                  + (p_index_cost_per_element * i_total_indexed_elements)
                  + (p_index_cost_per_byte * i_sum_indexed_bytes)

Mapping to EVM logs

Refer to FIP-0054.

Design Rationale

Brainstorm and simplification

A large part of the architectural discussion has been documented in issue filecoin-project/ref-fvm#728. Some ideas we considered along the way included:

  • Keeping events away from the protocol and in actor land, by tracking them in an Events actor.
  • Populating event indices and committing them on the block header.
  • Adaptable bloom filters, advertised on the block header.
  • Wrapping events in a broader concept of "execution artifacts", which in the future could be extended to include callbacks, futures, and more.

We deferred most ideas to the future and kept this initial proposal simple, yet extensible, focusing only on the fundamentals. A particular area to conduct further research into is Filecoin light clients and the kinds of proofs that could endorse queries for events, including inclusion, non-inclusion, and completeness proofs.

Data model

In terms of the technical design, we considered making the event payload an opaque IPLD blob accompanied by some form of a type discriminator. However, that would require an IDL upfront to interpret the payload.

The straightforward key-value inspired data model we've proposed enables straightforward introspection and analysis by chain explorers, developer tools, and monitoring tools.

Choosing a list of tuples instead of a map representation for entries allows for strict ordering and enables repeatable keys, and, thus, array-like entries, e.g. signers in a multisig transaction.

Over time, we expect the community:

  1. To standardise on a set of keys through FRC proposals.
  2. To offer utilities, macros, and sugar to construct and emit event payloads within FVM SDKs.

Flags

We had initially planned for a top-level indexed bitmap to convey indexing hints, but later pivoted towards attaching metadata to every entry in the form of flags. This enables extensibility and future-proofing. We consider the resulting size tradeoff to be acceptable.

In the future, flags may lead to consensus-relevant outcomes, such as populating (adaptable) bloom filters, committing block-level indices, supporting on-chain event subscriptions, special type indicators (e.g. "value is an address"), and more.

DAG-CBOR encoding guarantees that flag bitmaps with up to 4 bits will be minimally encoded as a single byte (major type 0, with inlined value in additional data).

Receipt data structure

We roll up all effectively-retained events from a call stack into an IPLD AMT, and stamp the receipt with its root CID. Unfortunately, the IPLD AMT data structure is not capable of producing efficient Merkle inclusion proofs, since while leaf nodes are linked to by CID, the data itself is embedded in the node data structure (with up to 2^bitwidth items in each), thus complicating the last leg of the proof.

This inability specifically affects light clients operations. However, above we explicitly descoped catering for eventual light clients with this proposal, as the Filecoin community has not agreed upon light client architecture yet. Therefore, any steps taken now would be premature, speculative and would be predicated on assumptions that may be invalidated when an architecture is discussed and established.

We explicitly accept that events produced with this proposal may not be provable to future light clients via efficient proofs. These events may be classified as "historical events" once the time comes.

Receipt events_root optionality

We had two options to deal with receipts with no events. We either made events_root a nillable field, or we used an empty IPLD AMT root. We chose the former to:

  1. Keep the serialized representation of receipts more compact. Even though receipts are never transmitted over the wire, clients may want to start storing them to prevent recomputation, as we forecast an increasing frequency of access from Filecoin dapps and wallets.
  2. To prevent potential special-casing in implementations against the well-known empty AMT CID, and/or, an extra an IPLD store load when resolving receipts.

Event data management

The FVM does not write receipts to the blockstore, neither does it write the events AMT themselves. Doing so would result in a dangling write. Clients are responsible for managing event data as they wish, just like they're responsible for managing receipts.

Event limits

The event limits were chosen to limit:

  1. The amount of memory required to process/store an event.
  2. The complexity/time required to process an event.
  • The maximum key length of 32 bytes was chosen to allow efficient indexing and to discourage using keys for complex values.
  • The maximum total value size of 8KiB was chosen to keep the event AMT leaf size well below 1MiB.
  • The event entry limit was chosen to be 256 was also chosen to keep the total event size small.

Overall, these limits lead to 8KiB of keys (max) and 8KiB of values (max) in a single event.

Gas parameters and future syscall API improvements

During the gas parameterization work, we identified that decoding CBOR events has variable costs that are hard to predict ahead of time. Concretely, we found that compute time scales with the number of event entries, which cannot be learnt until decoding is performed (unless we conduct an O(n) pre-scan of the data structure).

For this reason, we fitted our modelling to the concrete shapes of events emitted by the EVM runtime (the only user of this FIP at the moment of introduction). That is:

  • Entry count: maximum 5 (LOG4 instruction).
  • Entry key length: 1 and 2 characters.
  • Entry value length: 32-byte values, except for the last one (d key: EVM log data), whose size is user-determined.

Furthermore, we sized the parameters conservatively, acknowledging some over estimation when the EVM log data is large.

We will need to revisit the parameters and/or iterate on the API before extending the usage of this feature further. See discussion at filecoin-project/ref-fvm#1637 for ideas.

The gas calibration harness we used is located under testing/calibration in the filecoin-project/ref-fvm repo.

Backwards Compatibility

We are adding a new field to the MessageReceipt type. While this type is not serialized over the wire nor gossiped, it is returned from JSON-RPC calls. Hence, users of Filecoin clients may notice a new field, which could in turn break some of their code.

Test Cases

Refer to events test cases in the reference implementation permalink at time of writing.

Security Considerations

This FIP introduces a new feature with an associated chain data structure change. A dedicated gas schedule has been introduced covering computational overheads of the base feature, as well as subfeatures (e.g. indexing).

Attempts to write large amounts of data through events don't result in writes to the underlying blockstore that backs the state tree. Thus, there is limited to nil potential for contagion to other parts of the system in case of abuse.

Incentive Considerations

No incentive considerations apply.

Product Considerations

Required for EVM compatibility

Even though we've been seeking to introduce on-chain observability for some time now, we explicitly recognize that the immediate motivation is EVM compatibility.

Observability of built-in actors

Upcoming network versions should strive to update built-in actors to emit actor events for meaningful transitions and circumstances. This would allow observability, monitoring, accounting and other external tools to stop depending on highly-coupled, internal and low-level mechanisms to source information, such as state replays, state diffs, execution traces, and others.

Implementation

Reference implementation

The reference implementation is being carried out under filecoin-project/ref-fvm, and its first adoption is within the EVM runtime actor under filecoin-project/builtin-actors.

Separation of concerns

The FVM accumulates events into a sequence and calculates the aggregated commitment (event AMT root CID). However, managing event data is entirely the responsibility of the client.

Honouring indexing hints is optional but highly encouraged at this stage. Clients may offer new JSON-RPC operations to subscribe to and query events. That's the case of Lotus, as originally introduced in PR #9623.

Clients may prefer to store and track event data in a dedicated store, segregated from the chain store, potentially backed by a different database engine more suited to the expected write, query, and indexing patterns for this data.

Copyright

Copyright and related rights waived via CC0.