Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polish hash function of executor cache key #29556

Merged
merged 3 commits into from
Dec 11, 2020

Conversation

Aurelius84
Copy link
Contributor

@Aurelius84 Aurelius84 commented Dec 10, 2020

PR types

Bug fixes

PR changes

OPs

Describe

Polish hash function of executor cache key

BUG描述:

在CI执行部分动转静单测时,存在一定几率的随机挂,retry之后正常。

原因:

之前使用了 ProgramDesc *指针作为了 executor_cache内在unordered_map的 Key。可能存在两个单测用例中的ProgramDesc指针地址一样的情况,导致执行Program A时,命中了Program B的缓存Context,因此报错。

修复:

  1. 在hash函数计算中,增加更多的Program字段以避免hash冲突。

    • 由于hash函数经常被调用,考虑性能问题,此处并没有使用program->Proto的序列化string作为字段。频繁序列化的overhead不容忽视
    • 采用了 program_desc + block_desc + op_size (每个block_desc)作为hash计算字段。
  2. 移除了指针作为map的key,选择hash_value作为key,保持外层接口 std::pair<program_desc*, bool>不变

    • 因为program_desc指针指向物会在单元用例执行完后被析构,若以指针作为key,则存在通过key访问已析构指向物的潜在风险
    • 移除指针作为map的key,禁止通过key访问到program_desc*指针

@paddle-bot-old
Copy link

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

*/
hash_combine(&seed, prog_desc);
for (size_t i = 0; i < prog_desc->Size(); ++i) {
hash_combine(&seed, &prog_desc->Block(i));
Copy link
Contributor

@liym27 liym27 Dec 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hash_combine(&seed, &prog_desc->Block(i));
It seems to use a pointer of a block_desc but not a BlockDesc object to compute hash value. Is this as expected ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, prog_desc->Block(i) returns BlockDesc, it also hold a Proto::Message. It's not recommended to hash a Message object from Community discussion.

It's not a very good idea to use proto message as keys because they are not true value types. With the presence of unknown fields, two equal messages in one binary might be treated as different by another, and therefore two different binaries might get two different unordered_map<> even given the exactly same input data (if the hash function is implemented using MessageDifferencer). It's probably better to write your own hash function that has your desired behavior rather than replying on a generic one.

And Block(i) is creted by new,it will not lead to hash conflict while considering both address of ProgramDesc and its BlockDesc (s) and inner Op_size.

@Aurelius84 Aurelius84 requested a review from liym27 December 11, 2020 05:54
Copy link
Member

@zhhsplendid zhhsplendid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM while discussed TODO.

@Aurelius84 Aurelius84 merged commit 2a42250 into PaddlePaddle:develop Dec 11, 2020
Aurelius84 added a commit to Aurelius84/Paddle that referenced this pull request Dec 25, 2020
* Add more value to calculate hash key

* fix size_t

* polish code
@Aurelius84 Aurelius84 deleted the fix_executor_cache_key branch April 12, 2021 03:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants