From fdd9e187f3810c2c8cd96d1f80206fae1f18a1b5 Mon Sep 17 00:00:00 2001 From: David Wolinsky Date: Sat, 14 Jan 2023 19:38:39 -0800 Subject: [PATCH 1/3] Update resource_groups.md --- aips/resource_groups.md | 53 +++++++++++++++++++---------------------- 1 file changed, 25 insertions(+), 28 deletions(-) diff --git a/aips/resource_groups.md b/aips/resource_groups.md index bf7fde00..92d5a342 100644 --- a/aips/resource_groups.md +++ b/aips/resource_groups.md @@ -15,31 +15,29 @@ requires: N/A This AIP proposes resource groups to support storing multiple distinct Move resources together into a single storage slot. -Note: The general feedback seen so far implies that the requirements are not well established, so I’m going to revisit the wording in the doc and maybe add more bullet points. It is also imperative to understand what a resource is and the limitations of of resources, perhaps adding a background section that covers these concepts would have improved the readability. - ## Motivation -Over the course of development, it often becomes convenient to add new fields to a resource or support an optional, heterogeneous set of resources within an account. However, resources and structs are immutable after being published to the blockchain, hence, the only pathway to add a new field is via a new resource. +Over the course of development, it often becomes convenient to add new fields to a resource or support an optional, heterogeneous set of resources. However, resources and structs are immutable after being published to the blockchain, hence, the only pathway to add a new field is via a new resource. -Each distinct resource within Aptos requires storage slot. Each storage slot is a unique entry within a Merkle tree or authenticated data structure. Each proof within the authenticated data structure occupies `32 * LogN` bytes, where `N` is the total amount of storage slots. At `N = 1,000,000`, this results in a 640 byte proof. 
+Each distinct resource within Aptos requires a storage slot. Each storage slot is a unique entry within a Merkle tree or authenticated data structure. Each proof within the authenticated data structure occupies roughly `32 * log2(N)` bytes, where `N` is the total number of storage slots. At `N = 1,000,000`, `log2(N)` is about 20, so this results in a proof of roughly 640 bytes. -Adding even a single new resource with only an event handle uses approximately 40 bytes for storing the event handle; however, it requires an additional proof which is typically orders of magnitude larger. Beyond the capacity demands, reads and writes incur additional costs associated with proof verification and generation, respectively. +With 1,000,000 storage slots in use, adding even a new resource that contains only an event handle uses approximately 680 bytes, of which the event handle requires only 40. The remaining 640 bytes come from the new authenticated data proofs, which can be orders of magnitude larger than the data being authenticated. Beyond the capacity demands, reads and writes incur additional costs associated with proof verification and generation, respectively. -Resource groups allow for dynamic, co-location of data such that adding a new event can be done even after creation of the resource group and with a fixed storage and execution costs independent of the amount of slots in storage. +Resource groups allow for dynamic co-location of data, such that adding a new event can be done even after creation of the resource group and with fixed storage and execution costs independent of the number of slots in storage. In turn, this provides a convenient way to evolve data types and co-locate data from different resources. ## Rationale -A resource group co-locates data into a single storage slot by encoding within the Move source files attributes that specify which resources should be combined into a single storage slot. Resource groups have no semantic effect on Move, only on the organization of storage and its performance. 
+A resource group co-locates data into a single storage slot by encoding within the Move source files attributes that specify which resources should be combined into a single storage slot. Resource groups have no semantic effect on Move, only on the organization of storage. -At the storage layer, the resource groups are stored as a BCS-encoded BTreeMap where the key is a BCS-encoded fully qualified struct name and the value is the BCS-encoded data associated with the resource. +At the storage layer, the resource groups are stored as a BCS-encoded BTreeMap where the key is a BCS-encoded fully qualified struct name (`address::module_name::struct_name`, e.g., `0x1::account::Account`) and the value is the BCS-encoded data associated with the resource. ### Alternative 1 — Any within a SimpleMap -One alternative that was considered is storing data in a `SimpleMap` using the `any` module. While this is a model that could be shipped without any change to Aptos-core, it incurs some drawbacks around developer and application complexity both on and off-chain. There’s no implicit caching, and therefore any read or write would require a deserialization of the object and any write would require a serialization. This means a transaction with 3 writes would result in 3 deserializations and 3 serializations. In order to get around this, the framework would need substantial, non-negligible changes and thus this was quickly abandoned. Finally, due to the lack of a common pattern, indexers and APIs would not be able to easily access this data. +One alternative that was considered is storing data in a `SimpleMap` using the `any` module. While this is a model that could be shipped without any change to Aptos-core, it incurs some drawbacks around developer and application complexity both on and off-chain. There’s no implicit caching, and therefore any read or write would require a deserialization of the object and any write would require a serialization. 
This means a transaction with 3 writes would result in 3 deserializations and 3 serializations. In order to get around this, the framework would need substantial, non-negligible changes, though with the emergence of `SmartMap` there may be more viability here. Finally, due to the lack of a common pattern, indexers and APIs would not be able to easily access this data. ### Alternative 2 — Generics -Another alternative was using templates. The challenge with using templates is that data cannot be partially read without knowing what the template is. For example, consider an object that might be a token. In resource groups, one could easily read the `Object` or the `Token` resource. In templates, one would need to read the `Object`. This could also be worked around by complex framework changes and risks around partially reading BCS-encoded data, an application, which has yet to be considered. The same issues in Move would impact those using the REST API. +Another alternative was using templates. The challenge with using templates is that data cannot be partially read without knowing what the template type is. For example, consider an object that might be a token. In resource groups, one could easily read the `Object` or the `Token` resource. In templates, one would need to read the `Object`. This could also be worked around by complex framework changes and risks around partially reading BCS-encoded data, an application, which has yet to be considered. The same issues in Move would impact those using the REST API. ### Generalizations of Issues @@ -47,8 +45,8 @@ There are myriad combinations between the above two approaches. In general, the - High costs associated with deserialization and serialization for each read and/or write. - The current limitations around returning a reference to global memory limit utility of generics and add overheads to reads and writes of objects. -- Limited standardization resulting in more complexity for API and Indexer usage. 
-- A `struct` with `key` ability has better properties than `store`. For example, the latter can lead to data being stored in arbitrary places, complicating global addressing and discoverability, which may be desirable for certain applications. +- Limitations on standards resulting in more complexity for API and Indexer usage. +- Data access within models that want to leverage a `struct` with `store`. A `struct` with `key` ability has stricter and more understandable properties than `store`. For example, the latter can lead to data being placed in arbitrary places, complicating global addressing and discoverability, which may be desirable for certain applications. ## Specification @@ -56,14 +54,14 @@ There are myriad combinations between the above two approaches. In general, the A resource group consists of several distinct resources, or a Move `struct` that has the `key` ability. -Each resource group is identified by a common container: +Each resource group is identified by a common `Move` struct: ```move -#[resource_group_container(scope = global)] +#[resource_group(scope = global)] struct ObjectGroup { } ``` -Where the container is a fully qualified struct with no fields and the attribute: `resource_group_container`. The attribute `resource_group_container` has the parameter `scope` that limits the location of other entries within the resource group: +Where this `struct` has no fields and the attribute: `resource_group`. The attribute `resource_group` has the parameter `scope` that limits the location of other entries within the resource group: - `module` — only resources defined within the same module may be stored within the same resource group. - `address` — only resources defined within the same address may be stored within the same resource group. @@ -75,15 +73,15 @@ The motivation of using a `struct` is that 2. It can build upon the existing storage model that knows how to read and write data stored at `StructTag`s. 
Thus it limits the implementation impact to the VM and readers of storage, storage can remain agnostic to this change. 3. Only `struct` and `fun` can have attributes, which in turn let’s us define additional parameters like `scope`. -Each entry in a resource group is identified by the `resource_group` attribute: +Each entry in a resource group is identified by the `resource_group_member` attribute: ```move -#[resource_group(container = aptos_framework::object::ObjectGroup)] +#[resource_group_member(group = aptos_framework::object::ObjectGroup)] struct Object has key { guid_creation_num: u64, } -#[resource_group(container = aptos_framework::object::ObjectGroup)] +#[resource_group_member(group = aptos_framework::object::ObjectGroup)] struct Token has key { name: String, } @@ -91,15 +89,15 @@ struct Token has key { During compilation and publishing, these attributes are checked to ensure that: -1. A `resource_group_container` has no abilities and no fields. -2. The `scope` within the `resource_group_container` can only become more permissive, that is it can either remain at a remain at the same level of accessibility or increase to the next. -3. Each entry within a resource group has a `resource_group` attribute. -4. The `container` parameter is set to a struct that is labeled as a `resource_group_container`. -5. During upgrade, an existing `struct` cannot either enter or leave a `resource_group`. +1. A `resource_group_member` has no abilities and no fields. +2. The `scope` within the `resource_group_member` can only become more permissive, that is it can either remain at a remain at the same level of accessibility or increase to the next. +3. Each entry within a resource group has a `resource_group_member` attribute. +4. The `group` parameter is set to a struct that is labeled as a `resource_group`. +5. During upgrade, an existing `struct` cannot either enter or leave a `resource_group_member`. 
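The scope and upgrade checks in the list above can be sketched as two small predicates. This is a minimal illustration under stated assumptions, not the checker used by the compiler or the VM: `Scope`, `scope_upgrade_ok`, and `membership_upgrade_ok` are hypothetical names, the `Global` variant is taken from the `scope = global` example, and widening a scope by more than one level is assumed to be acceptable.

```rust
/// Scopes listed from least to most permissive, so the derived ordering
/// gives Module < Address < Global. (Hypothetical type for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Scope {
    Module,
    Address,
    Global,
}

/// Scope rule: on upgrade, a group's scope may stay the same or become
/// more permissive. Assumption: skipping a level is also accepted.
fn scope_upgrade_ok(old: Scope, new: Scope) -> bool {
    new >= old
}

/// Membership rule: on upgrade, an existing struct may not enter or leave
/// a resource group. `None` means the struct is not a member; `Some(name)`
/// carries the fully qualified name of its group. Moving between two
/// different groups is treated here as leaving one and entering another.
fn membership_upgrade_ok(old_group: Option<&str>, new_group: Option<&str>) -> bool {
    old_group == new_group
}
```

For example, `scope_upgrade_ok(Scope::Module, Scope::Address)` holds, while `scope_upgrade_ok(Scope::Global, Scope::Address)` does not, because narrowing the scope could strand members that were already deployed under the wider one.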
### Within Storage -From a storage perspective, resource group is stored as a BCS-encoded `BTreeMap`, where a `StructTag` is a known structure in Move of the form: `{ account: Address, module_name: String, struct_name: String }`. Whereas, a typical resource is stored as a `BCS encoded MoveValue`. +From a storage perspective, a resource group is stored as a BCS-encoded `BTreeMap`, where a `StructTag` is a known structure in Move of the form: `{ account: Address, module_name: String, struct_name: String }`. Whereas, a typical resource is stored as a `BCS encoded MoveValue`. At read time, a resource must be checked to see if that resource is part of a resource group by reading the associated metadata with a resource. If it is, the data is read from the resource group’s `StructTag` instead. @@ -148,9 +146,9 @@ In the current VM implementation, resources are cached upon read. This can be im ## Suggested implementation timeline -- A trivial implementation could be on Devnet by middle of January -- Assuming generally positive feedback, this could progress to the February Testnet cut -- If that goes well, it could be on Mainnet by March +- A complete working model is available by the middle of January. +- The Testnet cut is in the middle of February. +- Mainnet may land as early as the middle of March. ## References @@ -159,4 +157,3 @@ In the current VM implementation, resources are cached upon read. 
This can be im - [StructTag](https://github.com/move-language/move/blob/fcb75d8036d81e06bcb6ac102a414590e753579b/language/move-core/types/src/language_storage.rs#L91) - [Aptos Authenticated Data Structures](https://github.com/aptos-labs/aptos-core/blob/main/documentation/specifications/common/authenticated_data_structures.md) - [Move Language Book](https://move-language.github.io/move/) -- [Earlier proposal](https://www.notion.so/Storage-and-Language-d3bf1c128c4449388921fe2dde8038b6) From cda5cee8e849cc3a7c01f352c9922528cdeee9d8 Mon Sep 17 00:00:00 2001 From: David Wolinsky Date: Sun, 15 Jan 2023 22:20:57 -0800 Subject: [PATCH 2/3] Update resource_groups.md --- aips/resource_groups.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/aips/resource_groups.md b/aips/resource_groups.md index 92d5a342..6e5ff562 100644 --- a/aips/resource_groups.md +++ b/aips/resource_groups.md @@ -99,6 +99,8 @@ During compilation and publishing, these attributes are checked to ensure that: From a storage perspective, a resource group is stored as a BCS-encoded `BTreeMap`, where a `StructTag` is a known structure in Move of the form: `{ account: Address, module_name: String, struct_name: String }`. Whereas, a typical resource is stored as a `BCS encoded MoveValue`. +Resource groups introduce a new storage access path: `ResourceGroup` to distinguish from existing access paths. This provides a cleaner interface and segration of different types of storage. This becomes advantageous to indexers and other direct readers of storage that can now parse storage without inspecting module metadata. + At read time, a resource must be checked to see if that resource is part of a resource group by reading the associated metadata with a resource. If it is, the data is read from the resource group’s `StructTag` instead. At write time, an element of a resource group must be appropriately updated into a resource group by determining the delta the resource group as a result of the write operation. 
This results in the handful of possibilities: From 1e3f1a2718a2d651733f6480783595a902c700d9 Mon Sep 17 00:00:00 2001 From: David Wolinsky Date: Mon, 16 Jan 2023 05:54:04 -0800 Subject: [PATCH 3/3] Update resource_groups.md --- aips/resource_groups.md | 31 +++++++++++++++++++------------ 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/aips/resource_groups.md b/aips/resource_groups.md index 6e5ff562..ba62b99a 100644 --- a/aips/resource_groups.md +++ b/aips/resource_groups.md @@ -31,6 +31,10 @@ A resource group co-locates data into a single storage slot by encoding within t At the storage layer, the resource groups are stored as a BCS-encoded BTreeMap where the key is a BCS-encoded fully qualified struct name (`address::module_name::struct_name`, e.g., `0x1::account::Account`) and the value is the BCS-encoded data associated with the resource. +![image](https://user-images.githubusercontent.com/73818/212690642-f8c24ed8-8869-4ce2-9941-4958aae3f8a9.png) + +The above diagram illustrates data stored at address `0xcafef00d`. `0x1::account::Account` is a resource stored at address `0xcafef00d`. `0xaa::resource::Group` contains a set of resources, or a resource group, stored at the same address. The resource group packs multiple resources into the group. Resources within a resource group require nested reading: first the resource group must be read from storage, and then the specific resource is read from the resource group. + ### Alternative 1 — Any within a SimpleMap One alternative that was considered is storing data in a `SimpleMap` using the `any` module. While this is a model that could be shipped without any change to Aptos-core, it incurs some drawbacks around developer and application complexity both on and off-chain. There’s no implicit caching, and therefore any read or write would require a deserialization of the object and any write would require a serialization. 
This means a transaction with 3 writes would result in 3 deserializations and 3 serializations. In order to get around this, the framework would need substantial, non-negligible changes, though with the emergence of `SmartMap` there may be more viability here. Finally, due to the lack of a common pattern, indexers and APIs would not be able to easily access this data. @@ -89,19 +93,27 @@ struct Token has key { During compilation and publishing, these attributes are checked to ensure that: -1. A `resource_group_member` has no abilities and no fields. -2. The `scope` within the `resource_group_member` can only become more permissive, that is it can either remain at a remain at the same level of accessibility or increase to the next. -3. Each entry within a resource group has a `resource_group_member` attribute. +1. A `resource_group` has no abilities and no fields. +2. The `scope` within the `resource_group` can only become more permissive, that is, it can either remain at the same level of accessibility or increase to the next. +3. Each resource within a resource group has a `resource_group_member` attribute. 4. The `group` parameter is set to a struct that is labeled as a `resource_group`. -5. During upgrade, an existing `struct` cannot either enter or leave a `resource_group_member`. +5. During upgrade, an existing `struct` cannot add or remove a `resource_group_member` attribute. + +The motivations for these requirements are: + +1. Ensures that a `resource_group` struct won't be used for other storage purposes. While there is no strict requirement that this be true, it is intended to mitigate confusion for developers. +2. Making a scope less permissive can result in breakage of deployed `resource_group_member`s. +3. Without explicitly labeling a resource as a `resource_group_member`, there is no way for Move to know that it is within a `resource_group`. +4. 
Is discussed above as the intent to enforce clean type safety and a single place to define the properties of the resource group. +5. If there exists data stored within a resource, entering or leaving a resource group can result in that data being inaccessible. ### Within Storage From a storage perspective, a resource group is stored as a BCS-encoded `BTreeMap`, where a `StructTag` is a known structure in Move of the form: `{ account: Address, module_name: String, struct_name: String }`. Whereas, a typical resource is stored as a `BCS encoded MoveValue`. -Resource groups introduce a new storage access path: `ResourceGroup` to distinguish from existing access paths. This provides a cleaner interface and segration of different types of storage. This becomes advantageous to indexers and other direct readers of storage that can now parse storage without inspecting module metadata. +Resource groups introduce a new storage access path: `ResourceGroup` to distinguish from existing access paths. This provides a cleaner interface and segregation of different types of storage. This becomes advantageous to indexers and other direct readers of storage that can now parse storage without inspecting module metadata. Using the example above, `0x1::account::Account` is stored at `AccessPath::Resource(0xcafef00d, 0x1::account::Account)`, whereas the resource group and its contents are stored at `AccessPath::ResourceGroup(0xcafef00d, 0xaa::resource::Group)`. -At read time, a resource must be checked to see if that resource is part of a resource group by reading the associated metadata with a resource. If it is, the data is read from the resource group’s `StructTag` instead. +The only way to tell that a resource is within a resource group is by reading the module metadata associated with the resource. 
After reading module metadata, the storage client should either read directly from `AccessPath::Resource`, or first read `AccessPath::ResourceGroup`, deserialize the `BTreeMap`, and then extract the appropriate resource. At write time, an element of a resource group must be appropriately updated into a resource group by determining the delta the resource group as a result of the write operation. This results in the handful of possibilities: @@ -118,12 +130,7 @@ The implications for the gas schedule are: ### Within the Interface -To read a resource group from storage: - -- Attempt to read the resource from an account directly and find that it does not exist -- Read the `struct` metadata from storage and find that it is within a resource group -- Read the resource group from storage -- Parse and return the resource from the resource group +The storage section above discusses the layout of resources and resource groups. User-facing interfaces, such as a REST API, should not be exposed to resource groups. Resource groups are entirely a Move concept. A direct read on a resource group should be avoided. A resource group should be flattened and included within a set of resources when reading bulk resources at an address. ## Reference Implementation
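The following is not the reference implementation. It is a rough sketch of the read path and the flattened bulk read described in the hunks above, under several simplifying assumptions: BCS encoding is elided (a group slot holds the `BTreeMap` directly rather than its serialized bytes), addresses and struct tags are plain strings, and module metadata is modeled as a map from a member struct to its group. The names `Slot`, `Metadata`, `read_resource`, `list_resources`, and `example_storage`, and the member structs `0xaa::resource::A` and `0xaa::resource::B`, are hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

// Simplified stand-ins for this sketch; these are not the Aptos types.
type Address = String;
type StructTag = String;
type Bytes = Vec<u8>;

/// The two access paths named in the text.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum AccessPath {
    Resource(Address, StructTag),
    ResourceGroup(Address, StructTag),
}

/// Contents of one storage slot. BCS encoding is elided in this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Slot {
    Resource(Bytes),
    Group(BTreeMap<StructTag, Bytes>),
}

type Storage = HashMap<AccessPath, Slot>;
/// Hypothetical module metadata: member struct -> group struct.
type Metadata = HashMap<StructTag, StructTag>;

/// Consult metadata first, then read either the plain resource slot or the
/// group slot followed by a lookup inside the group.
fn read_resource(storage: &Storage, metadata: &Metadata, addr: &str, tag: &str) -> Option<Bytes> {
    match metadata.get(tag) {
        Some(group) => {
            let key = AccessPath::ResourceGroup(addr.to_string(), group.clone());
            match storage.get(&key)? {
                Slot::Group(members) => members.get(tag).cloned(),
                Slot::Resource(_) => None,
            }
        }
        None => {
            let key = AccessPath::Resource(addr.to_string(), tag.to_string());
            match storage.get(&key)? {
                Slot::Resource(bytes) => Some(bytes.clone()),
                Slot::Group(_) => None,
            }
        }
    }
}

/// Bulk read at an address: group members are flattened into the result so
/// that callers never see the group itself.
fn list_resources(storage: &Storage, addr: &str) -> BTreeMap<StructTag, Bytes> {
    let mut out = BTreeMap::new();
    for (path, slot) in storage {
        match (path, slot) {
            (AccessPath::Resource(a, tag), Slot::Resource(bytes)) if a == addr => {
                out.insert(tag.clone(), bytes.clone());
            }
            (AccessPath::ResourceGroup(a, _), Slot::Group(members)) if a == addr => {
                for (tag, bytes) in members {
                    out.insert(tag.clone(), bytes.clone());
                }
            }
            _ => {}
        }
    }
    out
}

/// Builds the layout from the text's example; member names are made up.
fn example_storage() -> (Storage, Metadata) {
    let addr = "0xcafef00d".to_string();
    let group = "0xaa::resource::Group".to_string();
    let mut members = BTreeMap::new();
    members.insert("0xaa::resource::A".to_string(), vec![2u8]);
    members.insert("0xaa::resource::B".to_string(), vec![3u8]);
    let mut storage = Storage::new();
    storage.insert(
        AccessPath::Resource(addr.clone(), "0x1::account::Account".to_string()),
        Slot::Resource(vec![1u8]),
    );
    storage.insert(AccessPath::ResourceGroup(addr, group.clone()), Slot::Group(members));
    let mut metadata = Metadata::new();
    metadata.insert("0xaa::resource::A".to_string(), group.clone());
    metadata.insert("0xaa::resource::B".to_string(), group);
    (storage, metadata)
}
```

With the storage built by `example_storage()`, `read_resource` resolves `0x1::account::Account` through `AccessPath::Resource` and `0xaa::resource::A` through `AccessPath::ResourceGroup`, while `list_resources` returns all three resources in one flat map, which is the shape a REST-style interface would expose.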