Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[framework] smart_vector and big_vector #5690

Merged
merged 1 commit into from
Dec 17, 2022
Merged

[framework] smart_vector and big_vector #5690

merged 1 commit into from
Dec 17, 2022

Conversation

lightmark
Copy link
Contributor

Description

This PR productionize big_vector and smart_vector to help move developer handle large sequencing containers on Aptos.
Basically they use Table as bucket to batch k elements in one table item.

Test Plan

cargo test

@msmouse
Copy link
Contributor

msmouse commented Nov 28, 2022

without actual reviewing: should delete aptos-core/aptos-move/move-examples/data_structures/sources/big_vector.move

@lightmark lightmark force-pushed the smart_vector branch 2 times, most recently from b21661a to 22aa836 Compare November 28, 2022 23:40
@msmouse msmouse requested review from msmouse and removed request for msmouse November 29, 2022 01:23
@msmouse
Copy link
Contributor

msmouse commented Nov 29, 2022

(sorry, accidental unqualified approval)

/// `SmartVector` should not enforece that.
struct SmartVector<T> has store {
inline_inner: vector<T>,
table_inner: vector<BigVector<T>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this vector<BigVector> not BigVector?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vector here is actually an option w/o Drop ability requirement of T to save space of a table handle when the smart_vector is small.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have some documentation here? otherwise, it is quite hard to understand why having a vector here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on adding some quick comments to explain

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (or something similar) may help to clarify:

spec struct SmartVector {
   invariant vector::length(inline_inner) + vector::length(table_inner) == 1;
}

Copy link
Contributor

@movekevin movekevin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some relatively minor comments. This makes sense in general but I'll take a much closer look after design discussion.

aptos-move/framework/aptos-stdlib/sources/big_vector.move Outdated Show resolved Hide resolved
aptos-move/framework/aptos-stdlib/sources/big_vector.move Outdated Show resolved Hide resolved
assert!(i < len, error::invalid_argument(EINDEX_OUT_OF_BOUNDS));
let val = swap_remove(v, i);
if (i + 1 < len - 1) {
range_reverse(v, i + 1, len - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using range_reserve instead of just simply shifting every element to the left by one (e.g. swapping)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems better. Yeah, let me change it.

/// Return if an element equal to e exists in the vector v.
/// disclaimer: this function is costly. Use it at your own discretion.
public fun contains<T>(v: &BigVector<T>, val: &T): bool {
if (is_empty(v)) return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this doesn't save much gas vs calling index_of as the main loop there would not run anyway. It can therefore be removed for simplicity.

length(v) == 0
}

#[test_only]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add more tests with more edge cases?

/// `SmartVector` should not enforece that.
struct SmartVector<T> has store {
inline_inner: vector<T>,
table_inner: vector<BigVector<T>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on adding some quick comments to explain

};
destroy(v);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add more unit tests with edge cases?

aptos-move/framework/aptos-stdlib/sources/big_vector.move Outdated Show resolved Hide resolved
Comment on lines 17 to 18
inline_inner: vector<T>,
table_inner: vector<BigVector<T>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_bucket and extra_buckets sound more expressive, just my 2 cents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed them as inline_vec and big_vec

vector::push_back(&mut v.inline_inner, val);
return
};
let estimated_avg_size = max((size_of_val(&v.inline_inner) + val_size) / (inline_len + 1), 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we generate this estimated_avg_size when creating this smart vector? this value should be the same for every push. doing this once in initialization can save gas

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can take the size of the first element as estimated_avg_size. But will be very inaccurate for T with variable size. TBH I am not sure which is better.

Copy link
Contributor

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks well suited for specification.

/// `SmartVector` should not enforece that.
struct SmartVector<T> has store {
inline_inner: vector<T>,
table_inner: vector<BigVector<T>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (or something similar) may help to clarify:

spec struct SmartVector {
   invariant vector::length(inline_inner) + vector::length(table_inner) == 1;
}

@lightmark lightmark force-pushed the smart_vector branch 2 times, most recently from f74d65e to f7320eb Compare December 3, 2022 02:59
@lightmark
Copy link
Contributor Author

@msmouse looks good to you now?

Comment on lines 27 to 38
public(friend) fun empty<T: store>(bucket_size: u64): BigVector<T> {
assert!(bucket_size > 0, error::invalid_argument(EZERO_BUCKET_SIZE));
BigVector {
buckets: table_with_length::new(),
end_index: 0,
num_buckets: 0,
bucket_size,
}
}

/// Create a vector of length 1 containing the passed in element.
public(friend) fun singleton<T: store>(e: T, bucket_size: u64): BigVector<T> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made these two friend only to make this struct private for friend modules. Is this the right way to do that? @movekevin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do these need to be friend? I.e. why can't these be open for anyone to use?

@lightmark
Copy link
Contributor Author

move two files to move-examples first for quick adoption.

Comment on lines 9 to 14
const EVECTOR_NOT_EMPTY: u64 = 2;
/// bucket_size cannot be 0
const EZERO_BUCKET_SIZE: u64 = 3;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now these are values that'll change for an existing module, is that a problem?

Oh.. I guess we didn't publish these example modules on-chain under aptos_std?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine for these to change even if this had been in aptos-stdlib as well. The error mapping would automatically get re-generated when the module is upgraded.

struct BigVector<T> has store {
buckets: TableWithLength<u64, vector<T>>,
end_index: BigVectorIndex,
end_index: u64,
num_buckets: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is redundant because buckets knows it's length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, buckets know its length but this index is not bucket index but element index.

Copy link
Contributor

@msmouse msmouse Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean num_buckets @lightmark

vector::swap(table_with_length::borrow_mut(&mut v.buckets, i_bucket_index), i_vector_index, j_vector_index);
return
};
let bucket_i = table_with_length::remove(&mut v.buckets, i_bucket_index);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments to explain what this block of code is doing?

Copy link
Contributor

@davidiw davidiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the framework docs, maybe just a simple cargo build on cached-framework or something. dismiss review when done.

@lightmark
Copy link
Contributor Author

@davidiw done

@lightmark lightmark force-pushed the smart_vector branch 2 times, most recently from 10577c9 to 81e076e Compare December 14, 2022 14:44
@lightmark lightmark enabled auto-merge (squash) December 15, 2022 05:02
@davidiw davidiw dismissed their stale review December 15, 2022 06:58

thanks!

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

✅ Forge suite land_blocking success on fb2775e7addfde643fad83a048f5a688406a2eaa

performance benchmark with full nodes : 6305 TPS, 6271 ms latency, 10800 ms p99 latency,(!) expired 940 out of 2693260 txns
Test Ok

@github-actions
Copy link
Contributor

✅ Forge suite compat success on testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fb2775e7addfde643fad83a048f5a688406a2eaa

Compatibility test results for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fb2775e7addfde643fad83a048f5a688406a2eaa (PR)
1. Check liveness of validators at old version: testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b
compatibility::simple-validator-upgrade::liveness-check : 7310 TPS, 5303 ms latency, 7500 ms p99 latency,no expired txns
2. Upgrading first Validator to new version: fb2775e7addfde643fad83a048f5a688406a2eaa
compatibility::simple-validator-upgrade::single-validator-upgrade : 4574 TPS, 9095 ms latency, 11200 ms p99 latency,no expired txns
3. Upgrading rest of first batch to new version: fb2775e7addfde643fad83a048f5a688406a2eaa
compatibility::simple-validator-upgrade::half-validator-upgrade : 4550 TPS, 8998 ms latency, 11900 ms p99 latency,no expired txns
4. upgrading second batch to new version: fb2775e7addfde643fad83a048f5a688406a2eaa
compatibility::simple-validator-upgrade::rest-validator-upgrade : 6276 TPS, 6129 ms latency, 10500 ms p99 latency,no expired txns
5. check swarm health
Compatibility test for testnet_2d8b1b57553d869190f61df1aaf7f31a8fc19a7b ==> fb2775e7addfde643fad83a048f5a688406a2eaa passed
Test Ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants