replica leader follower sync throughput per partition #1516

Open
qoo332001 opened this issue Feb 24, 2023 · 8 comments

Comments

@qoo332001
Collaborator

qoo332001 commented Feb 24, 2023

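While running experiments, I noticed that a replica follower pulling data from the replica leader (for both replication and reassignment) cannot saturate the I/O bandwidth: the maximum write rate per partition is only about 200 MB/s, as shown in the figure below.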
image

I later found that this can be improved by adjusting the server-side config replica.socket.receive.buffer.bytes, as shown in the figure below.
Here replica.socket.receive.buffer.bytes is set to -1 (the OS decides the receive buffer size).
image
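
As a rough illustration of what "the OS decides the receive buffer size" means at the socket level: Kafka's documentation describes this setting as the TCP receive buffer (SO_RCVBUF), with -1 meaning the OS default is used. The sketch below uses only Python's standard `socket` module and does not touch Kafka. The function name `receive_buffer_sizes` and the 64 KiB request are illustrative assumptions, not values taken from this issue.

```python
import socket

# Illustrative fixed size (assumption); not a value measured in this issue.
FIXED_BUFFER_BYTES = 64 * 1024


def receive_buffer_sizes(requested: int = FIXED_BUFFER_BYTES) -> tuple[int, int]:
    """Return (OS default SO_RCVBUF, SO_RCVBUF after an explicit request)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # What the OS picks when the application does not set anything.
        os_default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # What the OS reports after the application asks for a fixed size.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        explicit = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return os_default, explicit


if __name__ == "__main__":
    os_default, explicit = receive_buffer_sizes()
    print(f"OS default SO_RCVBUF: {os_default} bytes")
    print(f"After requesting {FIXED_BUFFER_BYTES} bytes: {explicit} bytes")
```

The reported numbers vary by operating system. On Linux, explicitly setting SO_RCVBUF also disables receive-buffer auto-tuning for that socket, which is one plausible reason a fixed value could cap throughput on a single connection; this issue does not confirm that this is the cause here.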

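To investigate further, I ran some experiments to check whether moving data between data folders within the same broker, or reading with a consumer, shows the same behavior.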

With the default replica.socket.receive.buffer.bytes

| log size (bytes) | consume time (s) | consume rate (MB/s) | data path migrate time (s) | migrate rate (MB/s) |
| --- | --- | --- | --- | --- |
| 71443680152 | 291.632 | 233.630064 | 111.276 | 612.297376 |
| 68629120395 | 268.931 | 243.3703421 | 98.98500013 | 661.209571 |
| 68955449081 | 235.099 | 279.7163781 | 117.033 | 561.9016924 |

With replica.socket.receive.buffer.bytes=-1

| log size (bytes) | consume time (s) | consume rate (MB/s) | data path migrate time (s) | migrate rate (MB/s) |
| --- | --- | --- | --- | --- |
| 71623441738 | 242.3769999 | 281.8148458 | 91.95000005 | 742.8541251 |
| 69620149637 | 229.1600001 | 289.7318406 | 132.427 | 501.3701782 |
| 123425044401 | 490.085 | 240.1773056 | 253.7119999 | 463.9405897 |

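Conclusion

Consumer performance is poor whether or not replica.socket.receive.buffer.bytes is changed, whereas moves between data paths run at nearly full speed. I therefore went on to adjust the consumer config so that the consumer can poll data at a reasonable rate (up to the I/O bandwidth limit).

With the default replica.socket.receive.buffer.bytes and the consumer's receive.buffer.bytes=-1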

| log size (bytes) | consume time (s) | consume rate (MB/s) | data path migrate time (s) | migrate rate (MB/s) |
| --- | --- | --- | --- | --- |
| 72898986403 | 131.7160001 | 527.8165977 | 120.3280001 | 577.7698539 |
| 75495782896 | 131.2089999 | 548.7305689 | 159.4100001 | 451.6554112 |
| 79165784915 | 125.1919999 | 473.6112902 | 144.7639999 | 521.5272847 |

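Conclusion

These experiments suggest that the server-side replica.socket.receive.buffer.bytes has no direct effect on consumer performance or on the speed of moves between data paths. Consumer throughput appears to depend on the consumer's own buffer size, not on the server-side replica receive buffer size.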
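
To make the comparison concrete, the consume rates reported in the three tables above can be averaged per configuration. This is only a small sketch over the posted numbers (three runs each); the dictionary keys and the function name `mean_rates` are labels chosen here for illustration.

```python
from statistics import mean

# Consume rates (MB/s) copied from the three tables in this issue.
CONSUME_RATES_MB_S = {
    "default": [233.630064, 243.3703421, 279.7163781],
    "broker replica.socket.receive.buffer.bytes=-1": [281.8148458, 289.7318406, 240.1773056],
    "consumer receive.buffer.bytes=-1": [527.8165977, 548.7305689, 473.6112902],
}


def mean_rates(rates: dict[str, list[float]]) -> dict[str, float]:
    """Average the per-run consume rates for each configuration."""
    return {name: mean(values) for name, values in rates.items()}


if __name__ == "__main__":
    for name, avg in mean_rates(CONSUME_RATES_MB_S).items():
        print(f"{name}: {avg:.1f} MB/s")
```

With only three runs per configuration the averages are indicative at best, but they point the same way as the conclusion: the broker-side change moves the consumer rate only slightly, while the consumer-side change roughly doubles it.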

@chia7712
Contributor

So could you give a conclusion? For example, with a parallelism of only 1, which parameters should be tuned, and how, to increase fetch speed?

@qoo332001
Collaborator Author

qoo332001 commented Feb 25, 2023

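> For example, with a parallelism of only 1, which parameters should be tuned, and how, to increase fetch speed?

  • Leader-follower fetch speed (including reassignment): the broker config replica.socket.receive.buffer.bytes
  • Consumer fetch speed: the consumer config receive.buffer.bytes
  • Data folder fetch: the default config already saturates disk I/O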
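
The summary above can be written down as the overrides one might place in a broker properties file and a consumer properties file. The sketch below only renders `key=value` lines. The helper `to_properties` and the two dictionary names are hypothetical, and -1 is simply the value used in the experiments in this issue, not a general recommendation.

```python
# Overrides used in the experiments in this issue; -1 lets the OS pick the size.
BROKER_OVERRIDES = {"replica.socket.receive.buffer.bytes": -1}
CONSUMER_OVERRIDES = {"receive.buffer.bytes": -1}


def to_properties(overrides: dict[str, object]) -> str:
    """Render overrides as Java-properties style key=value lines (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in overrides.items())


if __name__ == "__main__":
    print("# broker")
    print(to_properties(BROKER_OVERRIDES))
    print("# consumer")
    print(to_properties(CONSUMER_OVERRIDES))
```

Moves between data folders need no entry here, since the default config already saturated disk I/O in these tests.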

@chia7712
Contributor

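Well written. Could we build on this?

How about writing a Kafka Q&A that documents this observation? That way, people who run into it later can rule this problem out.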

@chia7712
Contributor

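> I later found that this can be improved by adjusting the server-side config socket.receive.buffer.bytes

In #1518 I asked which one actually has the effect: socket.receive.buffer.bytes or replica.socket.receive.buffer.bytes? The experiments seem to point to socket.receive.buffer.bytes, but that feels counterintuitive. Replica sync uses a separate socket from the one used for ordinary requests, so replica.socket.receive.buffer.bytes should be the one that matters.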

@qoo332001
Collaborator Author

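> Replica sync uses a separate socket from the one used for ordinary requests, so replica.socket.receive.buffer.bytes should be the one that matters.

In the earlier experiments socket.receive.buffer.bytes was always set to -1, and I separately tested replica.socket.receive.buffer.bytes at its default value; whether it is set or not turned out to make little difference. Here is a quick experiment:

Without setting socket.receive.buffer.bytes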
image

With socket.receive.buffer.bytes=-1
image

@qoo332001
Collaborator Author

@chia7712 I've updated it; please take another look. Thanks!

@chia7712
Contributor

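What do "data path migrate time" and "migrate rate(MB/s)" in the tables in the description mean? Could you explain?

Also, if migrate rate refers to replica sync, then adjusting replica.socket.receive.buffer.bytes doesn't seem to have much effect?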

@qoo332001
Collaborator Author

qoo332001 commented Feb 26, 2023

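> What do "data path migrate time" and "migrate rate(MB/s)" in the tables in the description mean? Could you explain?

data path migrate time: the time taken to move data between data paths
migrate rate(MB/s): (data size / move time) between data folders, i.e. the speed of moves between data folders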
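
The rate definition above can be checked against the posted tables with a few lines of arithmetic. This sketch assumes the log size column is in bytes and the time columns are in seconds. The units are not stated in the issue, but the first table row reproduces under that assumption with 1 MB taken as 1024 * 1024 bytes. The function name `rate_mb_per_s` is illustrative.

```python
# Assumption: sizes are in bytes, times in seconds, and 1 MB = 1024 * 1024 bytes.
BYTES_PER_MB = 1024 * 1024


def rate_mb_per_s(size_bytes: int, seconds: float) -> float:
    """Throughput as data size divided by elapsed time, in MB/s."""
    return size_bytes / seconds / BYTES_PER_MB


if __name__ == "__main__":
    # First row of the first table: log size, consume time, migrate time.
    log_size = 71443680152
    print(f"consume rate: {rate_mb_per_s(log_size, 291.632):.2f} MB/s")
    print(f"migrate rate: {rate_mb_per_s(log_size, 111.276):.2f} MB/s")
```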

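> Also, if migrate rate refers to replica sync, then adjusting replica.socket.receive.buffer.bytes doesn't seem to have much effect?

The migrate rate above refers to moves between data folders, so it is unaffected. Adjusting replica.socket.receive.buffer.bytes mainly affects the replica sync rate (the topmost figure).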
