Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: configurable block brokers #280

Merged
merged 5 commits into from
Oct 10, 2023
Merged

feat: configurable block brokers #280

merged 5 commits into from
Oct 10, 2023

Conversation

achingbrain
Copy link
Member

Adds configurable block providers to allow using bitswap but also other methods such as trustless gateways and any yet-to-be-invented way of resolving a CID to a block.

There are two block providers out of the box - bitswap and a trustless gateway provider that tries to download the block from one of a set of preconfigured gateways.

The block is verified after the provider has provided it. This lets users call helia.blockstore.put without verification for performance but anything coming from the network will be verified.

Comment on lines 21 to 33
hashLoader: {
getHasher: async (codecOrName: string | number): Promise<MultihashHasher<number>> => {
const hasher = hashers.find(hasher => {
return hasher.code === codecOrName || hasher.name === codecOrName
})

if (hasher != null) {
return hasher
}

throw new Error(`Could not load hasher for code/name "${codecOrName}"`)
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bitswap already verifies blocks pulled from the network. If we like the approach in this PR we should PR bitswap to skip verification and let it be handled here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we did remove verification from bitswap, we should only do it via configuration flag (verifyBlocks (default true)) so current users don't have a hard migration to do validation themselves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd release a major. Despite it being published for consumption outside IPFS projects realistically I don't anyone else is using it bar a few weekend hackers.


async get (cid: CID, options: AbortOptions & ProgressOptions<TrustlessGatewayGetBlockProgressEvents> = {}): Promise<Uint8Array> {
// choose a gateway
const url = this.gateways[Math.floor(Math.random() * this.gateways.length)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Randomly chosen gateway. If a gateway returns a 5xx we may wish to temporarily remove it from rotation and try another?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely should enable some ability to retry other given gateways. IIRC, we discussed in https://pl-strflt.notion.site/Helia-reliable-retrieval-technical-design-golden-path-255-15c4d3c25a404a85b6db8bf3f8d1f310?pvs=4 that just spamming all gateways would be fine for now because we're trying to optimize for successful retrieval.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I had a moment of doubt when I realised we'd be making 50 HTTP requests for a 10 block DAG.

/**
* Notify a block provider that a new block is available
*/
notify(cid: CID, block: Uint8Array, options?: NotifyProgressOptions): void
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is only used by bitswap to fulfil want list requests from other peers. I guess you could argue that this BlockProvider is now Providing Blocks to the network? Or we could factor it out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't mentally reconcile a BlockProvider with a notify method. This is why I didn't pull out bitswap in my PR, but naming is hard and I think supporting generic bitswap functionality is more important than getting the naming perfect.

I guess you could argue that this BlockProvider is now Providing Blocks to the network

This line of thought makes sense to me. We should make sure it's clear to consumers that not all blockProviders need to support a "notify" method.

Adds configurable block providers to allow using bitswap but also other
methods such as trustless gateways and any yet-to-be-invented way of
resolving a CID to a block..
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely more in the direction I wanted to go, but I wasn't sure how to reconcile bitswap as a "blockProvider". comments on that on appropriate lines.

I think a better name would help future devs and consumers, and I'm more inclined to unblock Helia-retrieval now rather than prevent Helia API difficulties in the future, but we should decide on a good name now since we’re exposing that to consumers.

A few other things I’ve thought of w.r.t. naming and future-proofing:

  • We could have a more generic BlockExchanger interface that doesn’t imply a direction; A BlockProvider that implies 1. provide to others (notify) and 2. provides to you (get) is confusing to me.
  • We could use naming similar to go-bitswap and js-ipfs-bitswap (“data trading module”), i.e. BlockTrader, BlockBroker
  • BlockBroker (my preferred name) could consist of both a BlockRetriever (get agent) and a BlockAnnouncer (notify agent) interface, and NetworkedStorage should deal only in BlockBrokers. (this could be done in future PRs)
  • We could use BlockProviderStrategy, RemoteBlockStrategies, or BlockStrategies to align with the description you wrote in packages/helia/src/index.ts

I'll put up a PR with some of the changes I've mentioned in comments so they can be pulled in without much edit if desired.

Comment on lines 21 to 33
hashLoader: {
getHasher: async (codecOrName: string | number): Promise<MultihashHasher<number>> => {
const hasher = hashers.find(hasher => {
return hasher.code === codecOrName || hasher.name === codecOrName
})

if (hasher != null) {
return hasher
}

throw new Error(`Could not load hasher for code/name "${codecOrName}"`)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we did remove verification from bitswap, we should only do it via configuration flag (verifyBlocks (default true)) so current users don't have a hard migration to do validation themselves.

/**
* Notify a block provider that a new block is available
*/
notify(cid: CID, block: Uint8Array, options?: NotifyProgressOptions): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't mentally reconcile a BlockProvider with a notify method. This is why I didn't pull out bitswap in my PR, but naming is hard and I think supporting generic bitswap functionality is more important than getting the naming perfect.

I guess you could argue that this BlockProvider is now Providing Blocks to the network

This line of thought makes sense to me. We should make sure it's clear to consumers that not all blockProviders need to support a "notify" method.

packages/helia/src/index.ts Show resolved Hide resolved
packages/helia/src/utils/networked-storage.ts Outdated Show resolved Hide resolved
packages/helia/src/utils/networked-storage.ts Outdated Show resolved Hide resolved

async get (cid: CID, options: AbortOptions & ProgressOptions<TrustlessGatewayGetBlockProgressEvents> = {}): Promise<Uint8Array> {
// choose a gateway
const url = this.gateways[Math.floor(Math.random() * this.gateways.length)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we definitely should enable some ability to retry other given gateways. IIRC, we discussed in https://pl-strflt.notion.site/Helia-reliable-retrieval-technical-design-golden-path-255-15c4d3c25a404a85b6db8bf3f8d1f310?pvs=4 that just spamming all gateways would be fine for now because we're trying to optimize for successful retrieval.

packages/helia/src/index.ts Show resolved Hide resolved
@achingbrain achingbrain changed the title feat: configurable block providers feat: configurable block brokers Oct 10, 2023
Copy link
Member

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Alex 👍

@achingbrain achingbrain merged commit 0749cbf into main Oct 10, 2023
15 checks passed
@achingbrain achingbrain deleted the feat/block-providers branch October 10, 2023 16:59
This was referenced Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

2 participants