fip | title | author | discussions-to | status | type | category | created | spec-sections | requires | replaces |
---|---|---|---|---|---|---|---|---|---|---|
0049 |
Actor events |
Raúl Kripalani (@raulk), Steven Allen (@stebalien) |
Final |
Technical Core |
Core |
2022-10-12 |
N/A |
N/A |
Table of Contents generated with DocToc
- Simple Summary
- Abstract
- Change Motivation
- Specification
- Design Rationale
- Backwards Compatibility
- Test Cases
- Security Considerations
- Incentive Considerations
- Product Considerations
- Implementation
- Copyright
This FIP adds the ability for actors to emit externally observable events during execution. Events are fire-and-forget, and can be thought of as a side effect from execution. The payloads of events are self-describing objects signaling that some relevant circumstance, action, or transition has ocurred. Actor events enable external agents to observe the activity on chain, and may eventually become the source of internal message triggers. Events cannot be queried or accessed by actor logic (hence the usage of the term "external").
This FIP introduces a minimal design for actor events, including their schema,
a new syscall, a new field in the MessageReceipt
structure, and mechanics to
commit these execution side effects on the chain. By minimal we mean that the
protocol stays largely unopinionated and unprescriptive around higher level
concerns like event indexing, traversal, and proofs (e.g. inclusion proofs for
light clients), and other future higher-level features.
Two main motivations warrant the introduction of actor events at this time.
-
They have been long missing from the protocol, complicating observability and forcing Filecoin network monitoring or accounting tools resort to creative approaches to obtain the data they needed, e.g. forking and instrumenting built-in actors code, or replaying messages and performing state tree diffs.
While this may have been tolerable while the system had a limited number of actors, such approaches will no longer scale in a world of discretionary programs on chain.
A future FIP could enhance built-in actors to emit events when power changes, sectors are onboarded, deals are made, sectors are terminated, penalties are incurred, etc.
-
The upcoming introduction of the Filecoin Ethereum Virtual Machine (FEVM) runtime necessitates an event mechanism at the protocol layer to handle the
LOG{0..4}
EVM opcodes, and the subsequent serving of such data through external JSON-RPC methods exposed by the client.
We introduce the following data model to represent and commit events on chain. These objects are DAG-CBOR encoded with tuple encoding.
/// Represents an event emitted throughout message execution, stamped by the FVM
/// with additional metadata, and committed on chain.
struct StampedEvent {
/// Carries the actor ID of the emitting actor (stamped by the FVM).
emitter: u64,
/// Payload of the event, as emitted by the actor.
event: ActorEvent,
}
/// An event as originally emitted by the actor.
type ActorEvent = [Entry];
struct Entry {
/// A bitmap conveying metadata or hints about this entry.
flags: u64,
/// The key of this entry.
key: String,
/// The value's IPLD codec (0 is defined to mean "no value/codec".
codec: u64,
/// The value of this entry as a byte string.
value: Bytes,
}
Its main constituent is a list of ordered key-value entries. This approach is inspired by structured logging concepts. Keys are UTF-8 strings, while values can be any IPLD block.
However, this FIP restricts the acceptable codecs to IPLD_RAW
(raw bytes) with
the IPLD codec 0x55. A future FIP will loosen this restriction.
Every entry contains a flags bitmap with underlying type u64, conveying metadata or hints about the entry. These are the currently supported flags:
hex | binary | meaning |
---|---|---|
0x01 |
b00000001 |
indexed by key (hint) |
0x02 |
b00000010 |
indexed by value (hint) |
Flags are additive, such that value 0x03 (b00000011
) demands that the entry be
indexed by both key and value.
A zero-valued bitflag indicates that no flags are in effect.
The two concrete flags introduced herein serve merely as client hints to signal indexing intent with different scopes. Clients should honor the request, and construct indices that facilitate corresponding fast lookups.
Event examples
The following examples demonstrate how to use these structures in practice. Payloads are DAG-CBOR encoded types.
Ethereum-style event
See FIP-0054 for more information.
Event {
emitter: 1234,
entries: [
(0x03, "t1", 0x55, <32-byte array>), // 0x55 = IPLD_RAW
(0x03, "t2", 0x55, <32-byte array>),
(0x03, "t3", 0x55, <32-byte array>),
(0x03, "t4", 0x55, <32-byte array>),
(0x03, "p", 0x55, <32-byte array>),
],
}
Fungible token transfer
NOTE: value types other than byte string are not supported yet.
// type, sender, and receiver indexed by key and value.
Event {
emitter: 1260,
entries: [
(0x03, "type", 0x51, cbor_encode("transfer")), // 0x51 = CBOR
(0x03, "sender", 0x51, cbor_encode(<actor id>)),
(0x03, "receiver", 0x51, cbor_encode(<actor id>)),
// BE-encoded BigInt, Not indexed.
(0x00, "amount", 0x51, cbor_encode(<token amount in atto precision>)),
],
}
Non-fungible token transfer
NOTE: value types other than byte string are not supported yet.
// All entries indexed, carrying the event type as a key (transfer).
Event {
emitter: 101,
entries: [
(0x01, "transfer", 0x0, []), // no codec, no value.
(0x03, "sender", 0x51, cbor_encode(<actor id>)),
(0x03, "receiver", 0x51, cbor_encode(<actor id>)),
(0x03, "token_id", 0x51, cbor_encode(<internal identifier>)),
],
}
Access control
NOTE: value types not supported yet.
// Compact representation of an "access denied" event, indexing the object id but not its key.
Event {
emitter: 480,
entries: [
(0x01, "denied", 0x0, []), // no codec, no value.
(0x02, "id", 0x51, cbor_encode(<object id>)),
],
}
Multisig approval
NOTE: value types not supported yet.
// Only type and signers indexed; signer is a variadic entry.
Event {
emitter: 1678,
entries: [
(0x03, "type", "approval"),
(0x03, "signer", 0x51, cbor_encode(<actor id>)),
(0x03, "signer", 0x51, cbor_encode(<actor id>)),
(0x03, "signer", 0x51, cbor_encode(<actor id>)),
// Not indexed.
(0x00, "callee", 0x51, cbor_encode(<actor id>)),
// BE-encoded BigInt, Not indexed.
(0x00, "amount", 0x51, cbor_encode(<token amount in atto precision>)),
// Not indexed.
(0x00, "msg_cid", 0x51, cbor_encode(<message cid>)),
],
}
We define a new syscall under the vm
namespace so that actors can emit events.
/// Emits an actor event. It takes a CBOR encoded ActorEvent that has been
/// written to Wasm memory, passed by memory offset and length.
///
/// The FVM validates the structural, syntatic, and semantic correctness of the
/// supplied event, and errors with `IllegalArgument` if the payload was invalid.
///
/// Calling this syscall may immediately halt execution with an out of gas error,
/// if such condition arises.
///
/// # Errors
///
/// | Error | Reason |
/// |---------------------|-------------------------------------------------------------------|
/// | [`IllegalArgument`] | Event failed to validate due to improper encoding or invalid data |
/// | [`ReadOnly`] | Call rejected because the kernel was in read-only mode |
fn emit_event(event_off: u32, event_len: u32) -> Result<()>;
This syscall is rejected with a ReadOnly
error number when the kernel is in
read-only mode.
This syscall immediately validates the input as per below. Validation failures
result in error number IllegalArgument
.
If validation passes, the syscall creates a StampedEvent
object, sets the
ActorID of the emitter in the emitter
field, and stores it in the internal
events accumulator.
- Validate the structural correctness of the input, optionally deserializing
into an in-memory representation of the
ActorEvent
. - Validate that each entry's value codec is acceptable. Currently, only
IPLD_RAW
(0x55) is allowed). - Validate that each entry's key is at most 32 bytes long and is valid utf-8.
- Validate that the event has at most 256 entries.
- Validate that the total length of all entry data combined is at most 8KiB.
During message call stack execution, StampedEvent
s are accumulated within the
FVM.
- When an invocation container for an actor exits normally (exit code 0), events emitted from that invocation and its transitive callees are retained.
- When an actor exits abnormally (exit code > 0), events emitted from that invocation and its transitive callee are discarded.
- Out of gas or fatal errors cause all events emitted by the call stack to be discarded.
At the end of successful message execution, the FVM serializes all retained
StampedEvent
s to DAG-CBOR and aggregates them into an AMT with bitwidth 5, in
their natural position. The AMT's root CID is then returned to the client, along
with the events themselves, together with the other results from message
execution.
The existing MessageReceipt
chain data structure is augmented with a new
events_root
field:
struct MessageReceipt {
exit_code: ExitCode,
return_data: RawBytes,
gas_used: i64,
// Root of AMT<StampedEvent, bitwidth=5>
// Absent when no events have been emitted.
// Serializes as a CBOR null value when absent, and as a CBOR IPLD Link when present.
events_root: Option<Cid>,
}
Because the network agrees on the content of receipts through the
ParentMessageReceipts
field on the BlockHeader
, actor events are implicitly
committed to the chain via this new events_root
field.
In addition to the syscall gas cost, emitting an event carries two fees:
- A validation fee, applied on entering the syscall.
- A processing fee, applied on successful validation and event acceptance.
// The length of the ActorEvent data structure as supplied by the actor.
v_input_len := len(actor_event)
// Gas parameters for validation.
// Syscall gas enough to cover the flat fee.
p_validation_cost_flat := 1750
p_validation_cost_per_byte := 25
// Validation fee. Covers the cost of validating the incoming `ActorEvent` object.
validation_fee := (p_memcpy_gas_per_byte * v_input_len)
+ p_validation_cost_flat
+ (p_validation_cost_per_byte * v_input_len)
// The estimated size of the StampedEvent. It is the length of the supplied ActorEvent
// + 9 bytes for the actor ID (worst case scenario) + some extra bytes for CBOR framing.
v_stamped_event_estsize := v_input_len + 12
v_total_indexed_elements := <number of elements being indexed, where keys and values count separately>
v_sum_indexed_bytes := <sum of indexed bytes across all elements>
// Cost per element indexed, assuming one indexing entry per element.
// Zero for now because indexing is not consensus-critical and
// not in the hot path of block validation or production.
p_index_cost_per_element := 0
// Cost per byte indexed, equated to the storage cost per byte.
p_index_cost_per_byte := p_storage_per_byte
// Processing fee
//
// Covers:
// - Cost of stamping the event and reserializing to a `StampedEvent`.
// - Cost of allocating the `StampedEvent`.
// - Anticipated cost of indexing the indexed entries.
// - Anticipated cost of writing into the AMT (allocation).
// - Anticipated cost of hashing AMT entries.
// - Anticipated cost of returning the event to the client.
processing_fee := 3 * (p_memcpy_gas_per_byte * i_stamped_event_estsize)
+ 2 * (p_block_alloc_cost.apply(i_stamped_event_estsize))
+ (p_hash_cost[Blake2b-256b1].apply(i_stamped_event_estsize))
+ (p_index_cost_per_element * i_total_indexed_elements)
+ (p_index_cost_per_byte * i_sum_indexed_bytes)
Refer to FIP-0054.
A large part of the architectural discussion has been documented in issue filecoin-project/ref-fvm#728. Some ideas we considered along the way included:
- Keeping events away from the protocol and in actor land, by tracking them in an Events actor.
- Populating event indices and committing them on the block header.
- Adaptable bloom filters, advertised on the block header.
- Wrapping events in a broader concept of "execution artifacts", which in the future could be extended to include callbacks, futures, and more.
We deferred most ideas to the future and kept this initial proposal simple, yet extensible, focusing only on the fundamentals. A particular area to conduct further research into is Filecoin light clients and the kinds of proofs that could endorse queries for events, including inclusion, non-inclusion, and completeness proofs.
In terms of the technical design, we considered making the event payload an opaque IPLD blob accompanied by some form of a type discriminator. However, that would require an IDL upfront to interpret the payload.
The straightforward key-value inspired data model we've proposed enables straightforward introspection and analysis by chain explorers, developer tools, and monitoring tools.
Choosing a list of tuples instead of a map representation for entries allows for strict ordering and enables repeatable keys, and, thus, array-like entries, e.g. signers in a multisig transaction.
Over time, we expect the community:
- To standardise on a set of keys through FRC proposals.
- To offer utilities, macros, and sugar to construct and emit event payloads within FVM SDKs.
We had initially planned for a top-level indexed
bitmap to convey indexing
hints, but later pivoted towards attaching metadata to every entry in the form
of flags
. This enables extensibility and future-proofing. We consider the
resulting size tradeoff to be acceptable.
In the future, flags may lead to consensus-relevant outcomes, such as populating (adaptable) bloom filters, committing block-level indices, supporting on-chain event subscriptions, special type indicators (e.g. "value is an address"), and more.
DAG-CBOR encoding guarantees that flag bitmaps with up to 4 bits will be minimally encoded as a single byte (major type 0, with inlined value in additional data).
We roll up all effectively-retained events from a call stack into an IPLD AMT, and stamp the receipt with its root CID. Unfortunately, the IPLD AMT data structure is not capable of producing efficient Merkle inclusion proofs, since while leaf nodes are linked to by CID, the data itself is embedded in the node data structure (with up to 2^bitwidth items in each), thus complicating the last leg of the proof.
This inability specifically affects light clients operations. However, above we explicitly descoped catering for eventual light clients with this proposal, as the Filecoin community has not agreed upon light client architecture yet. Therefore, any steps taken now would be premature, speculative and would be predicated on assumptions that may be invalidated when an architecture is discussed and established.
We explicitly accept that events produced with this proposal may not be provable to future light clients via efficient proofs. These events may be classified as "historical events" once the time comes.
We had two options to deal with receipts with no events. We either made
events_root
a nillable field, or we used an empty IPLD AMT root. We chose the
former to:
- Keep the serialized representation of receipts more compact. Even though receipts are never transmitted over the wire, clients may want to start storing them to prevent recomputation, as we forecast an increasing frequency of access from Filecoin dapps and wallets.
- To prevent potential special-casing in implementations against the well-known empty AMT CID, and/or, an extra an IPLD store load when resolving receipts.
The FVM does not write receipts to the blockstore, neither does it write the events AMT themselves. Doing so would result in a dangling write. Clients are responsible for managing event data as they wish, just like they're responsible for managing receipts.
The event limits were chosen to limit:
- The amount of memory required to process/store an event.
- The complexity/time required to process an event.
- The maximum key length of 32 bytes was chosen to allow efficient indexing and to discourage using keys for complex values.
- The maximum total value size of 8KiB was chosen to keep the event AMT leaf size well below 1MiB.
- The event entry limit was chosen to be 256 was also chosen to keep the total event size small.
Overall, these limits lead to 8KiB of keys (max) and 8KiB of values (max) in a single event.
During the gas parameterization work, we identified that decoding CBOR events has variable costs that are hard to predict ahead of time. Concretely, we found that compute time scales with the number of event entries, which cannot be learnt until decoding is performed (unless we conduct an O(n) pre-scan of the data structure).
For this reason, we fitted our modelling to the concrete shapes of events emitted by the EVM runtime (the only user of this FIP at the moment of introduction). That is:
- Entry count: maximum 5 (
LOG4
instruction). - Entry key length: 1 and 2 characters.
- Entry value length: 32-byte values, except for the last one (
d
key: EVM log data), whose size is user-determined.
Furthermore, we sized the parameters conservatively, acknowledging some over estimation when the EVM log data is large.
We will need to revisit the parameters and/or iterate on the API before
extending the usage of this feature further. See discussion at
filecoin-project/ref-fvm#1637
for ideas.
The gas calibration harness we used is located under
testing/calibration
in the filecoin-project/ref-fvm
repo.
We are adding a new field to the MessageReceipt
type. While this type is not
serialized over the wire nor gossiped, it is returned from JSON-RPC calls.
Hence, users of Filecoin clients may notice a new field, which could in turn
break some of their code.
Refer to events test cases in the reference implementation permalink at time of writing.
This FIP introduces a new feature with an associated chain data structure change. A dedicated gas schedule has been introduced covering computational overheads of the base feature, as well as subfeatures (e.g. indexing).
Attempts to write large amounts of data through events don't result in writes to the underlying blockstore that backs the state tree. Thus, there is limited to nil potential for contagion to other parts of the system in case of abuse.
No incentive considerations apply.
Even though we've been seeking to introduce on-chain observability for some time now, we explicitly recognize that the immediate motivation is EVM compatibility.
Upcoming network versions should strive to update built-in actors to emit actor events for meaningful transitions and circumstances. This would allow observability, monitoring, accounting and other external tools to stop depending on highly-coupled, internal and low-level mechanisms to source information, such as state replays, state diffs, execution traces, and others.
The reference implementation is being carried out under
filecoin-project/ref-fvm
, and its first adoption is within the EVM runtime
actor under filecoin-project/builtin-actors
.
The FVM accumulates events into a sequence and calculates the aggregated commitment (event AMT root CID). However, managing event data is entirely the responsibility of the client.
Honouring indexing hints is optional but highly encouraged at this stage. Clients may offer new JSON-RPC operations to subscribe to and query events. That's the case of Lotus, as originally introduced in PR #9623.
Clients may prefer to store and track event data in a dedicated store, segregated from the chain store, potentially backed by a different database engine more suited to the expected write, query, and indexing patterns for this data.
Copyright and related rights waived via CC0.