Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFT Metadata Crawler] Asset Uploader Throttler #14904

Merged

Conversation

just-in-chang
Copy link
Contributor

@just-in-chang just-in-chang commented Oct 9, 2024

Description

Implementation of the Asset Uploader Throttler.

Design: https://www.notion.so/aptoslabs/WIP-Asset-Uploader-cb33138051324d1fa03115ef1b55ed0b

How Has This Been Tested?

Key Areas to Review

src/asset_uploader/throttler/mod.rs

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

@just-in-chang just-in-chang requested a review from jillxuu October 9, 2024 10:55
Copy link

trunk-io bot commented Oct 9, 2024

⏱️ 22m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 9m 🟩
rust-move-tests 4m
rust-cargo-deny 4m 🟩🟩
check-dynamic-deps 2m 🟩🟩
general-lints 59s 🟩🟩
semgrep/ci 44s 🟩🟩
file_change_determinator 35s 🟩🟩
permission-check 8s 🟩🟩
permission-check 4s 🟩🟩

settingsfeedbackdocs ⋅ learn more about trunk.io

let mut queue = self_clone.asset_queue.lock();
queue.insert(asset);
};
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its possible that server restarts when asset from in progress queue was already sent for uploading. so you might run into errors where it's already uploaded, so failed to upload again but DB wasn't updated yet due to server restart. we might need to handle that error specifically and write to DB even if it errors (also make a GET to cloudflare to get the uploaded url?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a lot of error handling around asset already exists 409 (5409) errors:

  • If the case you described happens, the result is the asset existing in Cloudflare but not the DB. I introduce another endpoint to the worker that performs a best effort "lookup" operation to see if the asset_uri exists in Cloudflare. It needs to parse through the list of ALL images in Cloudflare to see if the asset we want is already there (there isn't a lookup option 😡). We help the lookup by additionally writing the original asset_uri in the metadata for each uploaded asset. This is an expensive operation, but it should basically never be used.
  • For other 409s, we attempt to first grab the data from the parsed_aasset_uris table to see if there's already an exiting CDN URI there.

}

impl Server for AssetUploaderThrottlerContext {
fn build_router(&self) -> axum::Router {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm wonder do we need to have a fixed threads like 4 to upload assets? right now we're single threaded in asset uploading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The uploading is actually all spun off onto a separate thread.

The only thing sequential about this is getting the next asset to upload, which is needed bc we want all the uploading to be in order

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remind me why we want all uploading to be in order? for fairness?

Comment on lines 57 to 58
asset_queue: Arc<Mutex<BTreeSet<AssetUploaderRequestStatusesQuery>>>,
in_progress_assets: Arc<Mutex<AHashSet<AssetUploaderRequestStatusesQuery>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use single lock for both queues to avoid possible race conditions btw queue updates and uploads? where an asset could be added to the queue by update_queue just as it's being removed and processed by handle_upload_assets? chatgpt suggested https://docs.rs/crossbeam-queue/latest/crossbeam_queue/ as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oki doki

@@ -0,0 +1,253 @@
// Copyright © Aptos Foundation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i assume we only will have single throttler instance anytime with in memory queue? let say in future we replace cloudflare with another provider without rate limit concerns, i guess we can even safely remove throttler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and yes 👍

Copy link
Contributor

@jillxuu jillxuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm! how to verify biz logics work as intended tho?

}

impl Server for AssetUploaderThrottlerContext {
fn build_router(&self) -> axum::Router {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remind me why we want all uploading to be in order? for fairness?

@just-in-chang just-in-chang merged commit 416cb79 into justin/asset-uploader-api Oct 22, 2024
43 of 47 checks passed
@just-in-chang just-in-chang deleted the justin/asset-uploader-throttler branch October 22, 2024 01:10
just-in-chang added a commit that referenced this pull request Oct 22, 2024
* s

* fix

* edits

* update model

* upd oop

* [NFT Metadata Crawler] Asset Uploader Throttler (#14904)

* yay

* upsert

* awesome

* AHHHHHHHHHHHHHHHHHHHHHHH

* boom

* boom

* lint

* lint
just-in-chang added a commit that referenced this pull request Oct 22, 2024
* s

* fix

* edits

* update model

* upd oop

* [NFT Metadata Crawler] Asset Uploader Throttler (#14904)

* yay

* upsert

* awesome

* AHHHHHHHHHHHHHHHHHHHHHHH

* boom

* boom

* lint

* lint
just-in-chang added a commit that referenced this pull request Oct 22, 2024
* squash

* rename

* address comments

* [NFT Metadata Crawler] Asset Uploader API (#14843)

* s

* fix

* edits

* update model

* upd oop

* [NFT Metadata Crawler] Asset Uploader Throttler (#14904)

* yay

* upsert

* awesome

* AHHHHHHHHHHHHHHHHHHHHHHH

* boom

* boom

* lint

* lint

* lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants