
[Research] droplet batch deal design / requirements gathering for droplet batch deal publishing #5809

Closed
Tracked by #5810 ...
Fatman13 opened this issue Mar 10, 2023 · 8 comments
Assignees
Labels
C-dev-productivity Category: Developer productivity CU-deal-service Venus storage deal service related issues design Venus feature/system design issues

Comments

@Fatman13
Contributor

Fatman13 commented Mar 10, 2023

Background

For the batch deal design, how do we better align with the current Fil+ requirements for LDN deals, and assist deal makers in publishing deals according to those requirements?

Ideas

  • For a given LDN there is one client address, and deals from that client must satisfy the conditions below; how can the software help the deal maker meet them?
    • Storage provider should not exceed 25% of total datacap.
    • Storage provider should not be storing duplicate data for more than 20%.
    • Storage provider should have published its public IP address.
    • All storage providers should be located in different regions.
  • Borrow directly from the CID checker built by PL to help deal makers satisfy the LDN requirements; PL's checker
  • Combine this with the protocol discussion ongoing with boost, and see what could be improved in the protocol.

cc @simlecode @hunjixin @Joss-Hua @cloudxin

@Fatman13 Fatman13 added C-dev-productivity Category: Developer productivity design Venus feature/system design issues CU-deal-service Venus storage deal service related issues labels Mar 10, 2023
@Fatman13 Fatman13 self-assigned this Mar 10, 2023
@Fatman13 Fatman13 mentioned this issue Mar 10, 2023
@simlecode
Collaborator

#5261

@laurarenpanda

laurarenpanda commented Mar 13, 2023

All storage providers should be located in different regions.

Could we consider setting a per-region cap that scales with the total number of SPs?
For example:

  1. 4 SPs: at most 1 per region;
  2. 7 SPs: at most 2 per region;
  3. 10 SPs: at most 3 per region.

This would keep the data spread across multiple regions while better matching the actual geographic distribution of today's Fil+ storage providers.
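The tiering above can be read as a linear rule. A minimal sketch, assuming the cap grows by one for every three additional SPs, i.e. `(n-1)/3` clamped to at least 1 (this formula is an inference that happens to reproduce the three data points; the actual policy may differ):

```go
package main

import "fmt"

// regionCap returns a per-region SP limit derived from the tiers in the
// comment above (4 SPs -> 1, 7 -> 2, 10 -> 3). The rule (n-1)/3, clamped
// to a minimum of 1, is an assumption, not an agreed policy.
func regionCap(totalSPs int) int {
	limit := (totalSPs - 1) / 3
	if limit < 1 {
		limit = 1
	}
	return limit
}

func main() {
	for _, n := range []int{4, 7, 10} {
		fmt.Printf("%d SPs -> at most %d per region\n", n, regionCap(n))
	}
}
```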

@hunjixin
Contributor

This probably requires associating a region with each SP.

@Fatman13
Contributor Author

This region info seems hard to maintain? Would users have to enter it themselves?

@laurarenpanda
This probably requires associating a region with each SP.

There is a case where region info may be missing: when an SP receives a data storage deal for the first time, the Checker Bot has no record of its location, so the information would have to be entered manually or obtained some other way.
So for the region-dispersal policy, could we consider only providing it as a reference for the Client, rather than enforcing it in the deal-publishing software?

@Fatman13
Contributor Author

Feels like we need a design doc? Something like this

@Fatman13 Fatman13 mentioned this issue Apr 10, 2023
@elvin-du elvin-du mentioned this issue Apr 20, 2023
@simlecode
Collaborator

Simple Summary

Currently the Client can only publish one deal at a time; we need a tool that lets the Client publish offline deals in batches.

Abstract

The Client can batch-publish offline deals based on the go-graphsplit tool, with deal-related information stored in a database. DC deals published by the Client must comply with the Fil+ requirements for LDN deals.

Related:

Motivation

When the Client has a large number of offline deals, the venus-market command line can only publish one deal per invocation, so the Client has to repeat the publishing operation over and over, which is inefficient. Batch publishing reduces the number of operations and the time spent in this scenario.

Specification (feature spec)

  • Batch publishing: publishing offline deals in batches
  • Deals: split into regular deals and DC deals; a DC deal is one that has been allocated datacap
  • Client: the party initiating deals
  • SP: storage provider
  • Publishing address: the address used when publishing deals; it may be an address holding datacap
  • go-graphsplit: a tool that splits a dataset into fixed-size car files
  • Fil+ requirements for LDN deals:
    • Storage provider should not exceed 25% of total datacap -- the datacap deals a Client sends to a single SP must not exceed 25% of the total datacap.
      • Track the share of the Client's total datacap held by each SP's datacap deals, and check the ratio at publish time.
    • Storage provider should not be storing duplicate data for more than 20% -- an SP must not hold more than 20% duplicate data.
      • Track the share of data with identical piece CIDs out of all piece data, and check the ratio at publish time.
    • Storage provider should have published its public IP address.
    • All storage providers should be located in different regions -- SPs must be located in different regions.
      • Bind a region to each SP.
  • Deal persistence: store batch-published offline deals in the database
  • New commands:
    • batch deal-publishing command
    • query the available datacap balance of an address
    • query offline deal information
  • Periodically check on-chain deal states and update local deal states; if a deal expires or is slashed, update the SP's piece-data ratios.
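The 25% publish-time check above can be sketched as a pure function. The function and parameter names below are illustrative, not the actual droplet API:

```go
package main

import "fmt"

// checkDatacapShare enforces "no SP above 25% of the client's total
// datacap". allocated maps SP -> datacap already allocated by this client;
// total is the client's total datacap; dealSize is the candidate deal.
func checkDatacapShare(allocated map[string]uint64, total uint64, sp string, dealSize uint64) error {
	if total == 0 {
		return fmt.Errorf("client has no datacap")
	}
	next := allocated[sp] + dealSize
	if next*4 > total { // next/total > 25%, without floating point
		return fmt.Errorf("deal would put %s at %.1f%% of total datacap (limit 25%%)",
			sp, float64(next)*100/float64(total))
	}
	return nil
}

func main() {
	alloc := map[string]uint64{"f01000": 20}
	fmt.Println(checkDatacapShare(alloc, 100, "f01000", 4)) // 24% -> nil
	fmt.Println(checkDatacapShare(alloc, 100, "f01000", 6)) // 26% -> error
}
```

The integer comparison `next*4 > total` allows an SP to sit at exactly 25% but no higher, which is one reading of "should not exceed".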

Design Rationale

Batch-generate offline deals from the contents of the manifest.csv file the user produced with go-graphsplit.

Backwards Compatibility

  1. Must stay compatible with the existing deal-publishing logic.
  2. Existing query APIs and commands can fetch online deal data; for offline deal data, should the existing query APIs and commands be reused?

Test Cases

Security Considerations

Implementation

Command-line commands

Batch deal publishing

./market-client storage deals batch -h

NAME:
   market-client storage deals batch - Batch storage deals with a miner

USAGE:
   market-client storage deals batch [command options] [car-dir miner price duration]

DESCRIPTION:
   Make deals with a miner.
   miner is the address of the miner you wish to make a deal with.
   price is measured in FIL/Epoch. Miners usually don't accept a bid
   lower than their advertised ask (which is in FIL/GiB/Epoch). You can check a miners listed price
   with './market-client storage asks query <miner address>'.
   duration is how long the miner should store the data for, in blocks.
   The minimum value is 518400 (6 months).

OPTIONS:
   --fast-retrieval             indicates that data should be available for fast retrieval (default: true)
   --from value                 specify address to fund the deal with
   --manifest value             piece cid file
   --provider-collateral value  specify the requested provider collateral the miner should put up
   --start-epoch value          specify the epoch that the deal should start at (default: -1)
   --verified-deal              indicate that the deal counts towards verified client total (default: true if client is verified, false otherwise)

car-dir is the directory holding the car files.

--manifest is the deal piece cid file, one piece cid per line; deals are published one by one in order.

Querying offline deals

  1. Query a single offline deal by proposal cid
  2. List all offline deals

Persisting deal data

Offline deal data is stored in badger, with the proposal cid as key and JSON-encoded data as value. The concrete data structure is as follows:

ClientDealProposal

type ClientOfflineDeal struct {
	ClientDealProposal ClientDealProposal

	ProposalCID cid.Cid
	Message     string         // error message if deal creation failed
	State       int            // local deal state
	SlashEpoch  abi.ChainEpoch // epoch at which the deal was slashed, if any
	CreatedAt   time.Time
	UpdatedAt   time.Time
}

接口:

type ClientOfflineDealRepo interface {
	SaveDeal(ctx context.Context, deal *ClientOfflineDeal) error
	GetDeal(ctx context.Context, proposalCid cid.Cid) (*ClientOfflineDeal, error)
	ListDeal(ctx context.Context) ([]*ClientOfflineDeal, error)
}

Batch deal publishing API

DealsParams

ClientBatchDeal(ctx context.Context, params *client.DealsParams) (*client.DealResults, error) //perm:write

type DealResults struct {
	Results []*DealResult
}

type DealResult struct {
	ProposalCID cid.Cid
	// Create deal failed
	Message string
}

type DealsParams struct {
	Params []*DealParams
}

Updating local deal state

Periodically call the StateMarketDeals API to fetch all active deals, and update the local offline deal state according to the on-chain deal state.

Piece statistics across all datacap deals

  1. Compute each SP's piece duplication rate: duplication rate = 1 - (unique piece data / all piece data held by the SP).
     For example, with two copies of piece1 and three copies of piece3, duplication rate = 1 - (piece1 + piece3) / (2 * piece1 + 3 * piece3).
  2. Compute the share of the publishing address's datacap allocated to each SP.

If these statistics are computed on the market-client side, an API must be exposed for the batch-publishing path to call; if they are computed at batch-publish time, every run has to fetch all deals and recompute.
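The duplication formula above can be sketched as follows. `pieceInfo` and `duplicationRate` are illustrative helpers, not the droplet implementation:

```go
package main

import "fmt"

// pieceInfo is one stored piece: its cid and size in bytes.
type pieceInfo struct {
	CID  string
	Size uint64
}

// duplicationRate = 1 - (total size of unique pieces / total size stored).
func duplicationRate(pieces []pieceInfo) float64 {
	unique := map[string]uint64{}
	var total uint64
	for _, p := range pieces {
		unique[p.CID] = p.Size // identical cid implies identical size
		total += p.Size
	}
	if total == 0 {
		return 0
	}
	var uniqueTotal uint64
	for _, s := range unique {
		uniqueTotal += s
	}
	return 1 - float64(uniqueTotal)/float64(total)
}

func main() {
	// Two copies of piece1 (size 10) and three of piece3 (size 20):
	// 1 - (10+20)/(2*10+3*20) = 1 - 30/80 = 0.625
	pieces := []pieceInfo{
		{"piece1", 10}, {"piece1", 10},
		{"piece3", 20}, {"piece3", 20}, {"piece3", 20},
	}
	fmt.Println(duplicationRate(pieces)) // 0.625
}
```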

// Get the piece distribution of datacap deals
ClientGetVerifiedDealDistribution(ctx context.Context) (*PieceDistribution, error) 

type ProviderDistribution struct {
	Provider              address.Address
	Total                 uint64
	UniqPieces            map[string]uint64
	DuplicationPercentage float64
}

type ReplicaDistribution struct {
	Client                     address.Address
	Total                      uint64
	ProvidersPieceDistribution []ProviderDistribution
}

type DealDistribution struct {
	ProvidersDistribution []ProviderDistribution
	ReplicasDistribution  []ReplicaDistribution
}

@Fatman13
Contributor Author

Should this be considered done?

Projects
Status: Done
Development

No branches or pull requests

4 participants