-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(collector): add statistics for partition hotspot #427
Conversation
为什么要把读写的qps和cu合在一起判断热点呢? |
是拿qps和cu两个维度来衡量哪个partition是热点,不过cu那个用10倍来判断不是特别好,感觉应该调大一点 |
我其实想问的是把读写分开是不是更准确?根据读的qps和cu来算读热点,写的qps和cu来算写热点,读写混合在一起判断的热点会不会不够准确? |
我觉得读和写分开倒不是准确不准确的问题,是可以区分出读热点和写热点。但是做热点功能的出发点是防止访问某些分片的流量过于集中、缓存分片服务被打垮。感觉区分出读和写意义不是很大呢 |
热点统计建议用 policy/strategy 设计模式来实现,这样找相关代码的时候,看到譬如 “TableHotspotPolicy”,可以知道这是做热点统计的代码。这有个好处是,如果未来我们想修改热点判断的策略,你只需要改 设计上你可以 class table_hotspot_policy {
public:
void detect_hotspot (table_statistics stats);
private:
// 实现可以有各种做法,可以用最大最小值,也可以有其他策略
// 可以设计成打分制,超出一定分值就算热点
perf_counter _perf_hotspot_score;
std::string table;
}; |
我也认为用 total_qps 和 total_cu 来判断热点有问题,首先不是读有热点,写就有热点的。可能存在写很均匀但读有热点的情况。 建议的策略是先按 read_cu/write_cu 和每一种操作的 qps 来分别判断热点,每一项可以占 1 分 @hycdong @acelyc111 可以讨论一下这样的策略是否合适。 |
我觉得这个用这个策略模式倒是很好,但是策略模式的用途是当有各种不同的策略时,比如:打分、最大/最小。所以命名这块根据我们确定的策略命名,而不是简单的用table_hotspot_policy。因为不论打分还是最大/最小都属于table_hotspot_policy,table_hotspot_policy可以当做他们的父类来使用 |
@@ -130,88 +130,30 @@ void info_collector::stop() { _tracker.cancel_outstanding_tasks(); } | |||
void info_collector::on_app_stat() | |||
{ | |||
ddebug("start to stat apps"); | |||
std::vector<row_data> rows; | |||
if (!get_app_stat(&_shell_context, "", rows)) { | |||
std::map<std::string, std::vector<row_data>> all_rows; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这些只是重构吧?重构的先单独提一个PR便于review吧
duplicate |
What problem does this PR solve?
In online pegasus system, There are some hotspot keys. Which produce unbalanced flow. so statistics for partition hotspot is needed.
What is changed and how it works?
In info collector server, it will get qps/cu for each partition, and count the max/min of them. Using max/min, we can get the scale between max and min. If the scale is greater than 10, we assume the partition has the max value is a hotspot.
Check List
Related changes
yes
no
yes