TreeMap impl based on AVL-tree #154
@nearmax , please check out the draft (please note it is a draft: I'll catch up with all TODOs) - does it look like what's expected?
Half of CI checks fail (for UPDATE: I also did some refactorings/cleanups (like function extraction to avoid code duplication), hope it is OK. |
Hey @sergey-melnychuk ! Thank you for working on it. Sorry for not reviewing it yesterday, I will review it today. |
@nearmax , no problem, PR is ready for review now. |
Please address the comments before merging.
Update: I didn't initially realize that this is a PR for an accepted gitcoin proposal. Editing out parts of the comment which are irrelevant in the context.
What is the rationale behind using the heap?
If we are OK with sorting each time we call `iter` (which we likely are not), why not just keep a map and a vector, and sort the vector with whatever sorting algorithm whenever `iter` is called?
Depending on the API that we want to expose, the ordered map generally needs to allow one or both of the following:
1. Iterate over the first (few) elements in sorted order. The current implementation is not usable for that, because such an iterator is expected to work in O(k log n) to fetch k elements, not O(n log n).
2. Iterate over the (few) elements after some key (`lower_bound` or `range` in most languages in which sorted dicts are implemented). Again, the current implementation doesn't support it.
If we want to support either of the scenarios above (and I'm sure we do want to support (1)), we should implement an AVL or a RedBlack tree or something along the lines.
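To make the two scenarios concrete, here is a minimal sketch that uses the standard library's `BTreeMap` as a stand-in for whatever ordered map ends up in this crate. The helper names `first_k` and `next_k_after` are hypothetical and are not part of this PR; the point is only that neither call has to sort the whole collection first.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Scenario (1): the first `k` entries in ascending key order.
/// Hypothetical helper; `BTreeMap` is only a stand-in for an ordered map.
fn first_k(map: &BTreeMap<u64, u64>, k: usize) -> Vec<(u64, u64)> {
    // The iterator is lazy, so only `k` entries are visited.
    map.iter().take(k).map(|(key, value)| (*key, *value)).collect()
}

/// Scenario (2): up to `k` entries whose keys are strictly greater than `key`.
/// Hypothetical helper, analogous to `lower_bound`/`range` in other languages.
fn next_k_after(map: &BTreeMap<u64, u64>, key: u64, k: usize) -> Vec<(u64, u64)> {
    map.range((Excluded(key), Unbounded))
        .take(k)
        .map(|(key, value)| (*key, *value))
        .collect()
}
```

With keys 1, 3, 5, 7, 9 in the map, `first_k(&map, 2)` yields the entries for 1 and 3, and `next_k_after(&map, 3, 2)` yields the entries for 5 and 7.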
If we actually just want to support iteration that sorts the entire collection, we should just have a vector and a map, and sort each time `iter` is called. It will be significantly simpler and cleaner than the `Heap`.
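A rough sketch of that simpler alternative, assuming in-memory `HashMap` and `Vec` rather than the persistent collections of this crate. The type `SortOnIterMap` and its methods are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical "map plus vector" container that sorts on every iteration.
struct SortOnIterMap {
    values: HashMap<u64, u64>,
    keys: Vec<u64>,
}

impl SortOnIterMap {
    fn new() -> Self {
        Self { values: HashMap::new(), keys: Vec::new() }
    }

    /// Inserts a key-value pair and returns the previous value, if any.
    /// The key is pushed to the vector only when it is new, so inserting
    /// the same key twice does not produce a duplicate entry.
    fn insert(&mut self, key: u64, value: u64) -> Option<u64> {
        let old = self.values.insert(key, value);
        if old.is_none() {
            self.keys.push(key);
        }
        old
    }

    /// Sorts all keys (O(n log n)) and returns every entry in key order.
    fn iter_sorted(&mut self) -> Vec<(u64, u64)> {
        self.keys.sort_unstable();
        self.keys.iter().map(|k| (*k, self.values[k])).collect()
    }
}
```

Every call to `iter_sorted` pays for a full sort, which is why this only fits the case where the whole collection is wanted in order.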
Using a heap is certainly not the right tool for this problem? Unless I'm missing something :)
I made a mistake when reviewing the gitcoin proposal. I did remember the requirements from the partners, but I forgot that it is not possible to iterate over a heap in sorted order (in log N per step), which is how I interpreted the proposal initially. Since there is value in Heap and HeapMap for cases where there are order books but the entire order book does not need to be retrieved, I suggest we keep this code and pay out the bounty, since it was my fault. But then my next bounty would be to implement an AVL tree, RB tree or BTree. |
Your code is fine. We broke our CI recently: #156 |
Undoing the approval, since @SkidanovAlex found a bug when the same key is inserted twice.
|
@sergey-melnychuk We will be happy to pay out the bounty, given that your code matches the proposal we accepted. Unfortunately, we didn't realize at the time of acceptance that a heap is not a suitable data structure for an ordered map, so we will not be merging this implementation in and are still looking for a sorted map implementation. |
@nearmax , @SkidanovAlex , thank you for the review, I will look into it today. This is the initial approach, prototype, so work only starts here, not ends :) I was trying to prototype quickly - to get feedback and start for next iteration(s). To summarize:
Next impl must fit these:
IMHO AVL is the best fit in performance/complexity tradeoff. Let me come up with a prototype of AVL based on Maps. |
Thank you, AVL would be the best. And sorry for the mess up. |
@nearmax , nothing to be sorry about - it is much easier to clarify requirements in front of a prototype, than out of thin air :) Added |
@mikhailOK Is this PR good to go, can we merge it? |
@SkidanovAlex @mikhailOK LGTMed this PR. Is there anything from your side you want to comment on? |
The AVL tree looks good to me with all the new tests. Can we remove Heap and the HeapMap? AVL tree is a strict superset of functionality. Let's also get LGTM from Mikhail. |
AFAIU heap has smaller constant in O(log N) than AVL, so it is beneficial to keep around. @mikhailOK already LGTMed, see above. |
Heap is kind of weird right now: it's a priority queue, but one that doesn't allow duplicate values. It makes sense to remove it or make it private until we decide what we want from it. |
@sergey-melnychuk Could you remove the heap? After that we will merge and I will release the bounty. |
@nearmax , heap is removed - I'll submit separate PR with priority queue, then you can decide if it is useful for you. |
@SkidanovAlex could you change your review to "approved" so that I don't force merge? |
@sergey-melnychuk Thank you very much for spending your time on implementing this collection together with us! I have paid out the bounty, please let us know if there are any issues with the payment. We are very satisfied with the result and hope for future collaboration with you. |
TreeMap based on AVL-tree
- `get`/`contains_key`
- `insert`/`remove`
- `min`/`max`
- `lower`/`higher`
- `range` of K elements

Closes #146
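For readers unfamiliar with the navigation operations named in the list, the sketch below shows one common reading of them, using the standard library's `BTreeMap` as a stand-in. It assumes that `lower` means the greatest key strictly less than the argument and `higher` means the smallest key strictly greater than it; that interpretation and the function signatures are assumptions made for illustration, not the API added by this PR.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Smallest key in the map, if any.
fn min_key(map: &BTreeMap<u64, u64>) -> Option<u64> {
    map.keys().next().copied()
}

/// Largest key in the map, if any.
fn max_key(map: &BTreeMap<u64, u64>) -> Option<u64> {
    map.keys().next_back().copied()
}

/// Greatest key strictly less than `key` (assumed meaning of `lower`).
fn lower(map: &BTreeMap<u64, u64>, key: u64) -> Option<u64> {
    map.range(..key).next_back().map(|(k, _)| *k)
}

/// Smallest key strictly greater than `key` (assumed meaning of `higher`).
fn higher(map: &BTreeMap<u64, u64>, key: u64) -> Option<u64> {
    map.range((Excluded(key), Unbounded)).next().map(|(k, _)| *k)
}
```

Under these assumptions, a map with keys 10, 20 and 30 gives `lower(&map, 20) == Some(10)`, `higher(&map, 25) == Some(30)`, and `None` when no such key exists.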