Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spike/dega 1748 #7

Merged
merged 16 commits into from
Mar 25, 2024
Merged

Spike/dega 1748 #7

merged 16 commits into from
Mar 25, 2024

Conversation

arodionov53
Copy link

Adding functionality for boolean expression factorization - running several matches against several be-trees in sequence.
See Troubleshooting tools: Targeting, Creatives, and Frequency Capping.

@arodionov53 arodionov53 requested a review from a team as a code owner March 20, 2024 15:35
@vs-ads
Copy link

vs-ads commented Mar 21, 2024

My comments are rather suggestions, they are not conditions for approval.
It is at your discretion to follow them.
In the order of importance.

  1. It is not clear how the high-level business requirements the PR refers to,
    https://adgear.atlassian.net/browse/PEDSP-1431,
    https://adgear.atlassian.net/wiki/spaces/AGILE/pages/19867599792/Troubleshooting+tool+Internal+loss+reasons,
    connected to the implementation.
    If I wouldn't participate in one discussion before I wouldn't know how.
    Would be beneficial to have a Tech document mapping business requirements to the implementation;

  2. It is not clear what are use cases for the functionality introduced by the PR because of the absence of a Tech doc.
    I can only guess what the use cases are.
    I pushed the commit with the tests to verify that I understand the main use case properly.
    If the tests do not reflect the intended use case please let me know. I will revoke the commit then.
    7073b52

  3. erl_betree:betree_make_event creates MEM_EVENT resource.
    It could make search for the event at the same invocation as well to save one round trip to the NIF;

  4. When Ids are returned from NIF they also can be kept with MEM_EVENT resource.
    This would allow to avoid passing the Ids back to the NIF and unpack them from the list, like in
    betree.c, ERL_NIF_TERM nif_betree_search_evt_ids(..., lines 1087-1101

  5. c_src/build_betree script refers to v1.2.1 tag of http://github.com/adgear/be-tree
    tag v1.3.0 on be-tree would reflect that there are major (feature level) changes.

  6. When resolving merge conflicts I didn't notice that order of functions in betree.c get re-arranged and logically related functions are far from each other.
    Let me know if you want me to address that and I'll put them in the logically consistent order.

  7. I extended benchmarks to use erl_betree:betree_make_event functionality.
    Let me know if you think that this commit should not be a part of the PR. I will revoke the commit then.
    83bc082 (edited)

@arodionov53
Copy link
Author

arodionov53 commented Mar 21, 2024

Victor, thank you for your comments.

  1. I can only agree. We need to defined what information we collecting, where we put it and for what reason. This will be very helpful for teams that will work with the data.
  2. I think that we have to change your example to make it more realistic.
  3. Not sure that I do understand you fully The reason for creating this structure is that bid request contains information about almost one hundred variables, some on which may have list of thousand elements long. Converting this information from Erlang to C structures takes considerable time. So not to repeat this process for every be-tree in the sequence I do it once at the very beginning.
    [VS] My suggestion is to extend erl_betree:betree_make_event functionality:
  • MEM_EVENT resource will be returned as in current implementation;
  • before returning the MEM_EVENT to perform the first search.
    [AR] The first step is the longest one and I want to make time spend in NIF shorter. For this reason I did not incorporate MEM_EVENT creation in the first step.
  1. I considered this option. The reason why I do not go this way is: 1. these ids are required in Erlang for reporting. 2. The lists are short and can be converted to C structures fast. I have some ideas at this point how to improved performance close to your suggestion but decided to make later.
    [VS] My suggestion is to extend:
  • along with returning the Ids to Erlang level;
  • save these Ids for the next use in MEM_EVENT;
    [AR] - This definitely can be done but then there are two choices:
  • Creating a new copy of MEM_EVENT - then some extra cost will be paid to GC
  • Modify MEM_EVENT - I tried to follow principles of functional programming
  • But the main reason - I do not feel that this can give much .
  1. You are right I will change this. The reason why I choose 1.2. is that you changes already made a huge change.

  2. Please do. Thank you.
    [VS] Ok.

  3. What are results? I'm not sure that I understand your test.
    [VS] The benchmarks:

  • provide observations for memory allocations;
  • collect statistics how much time is spent to evaluate events;
  • compare statistics for different implementations;
  • write evaluations outputs to file;
  • compare evaluations output for different implementations.
    [AR] The problem is that we need events created in production. They may contain huge data structures (mostly lists).

@arodionov53 arodionov53 merged commit c1cd823 into master Mar 25, 2024
1 of 2 checks passed
@arodionov53 arodionov53 deleted the spike/DEGA-1748 branch March 25, 2024 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants