Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for BTreeMap cursors #107540

Open
1 of 3 tasks
Amanieu opened this issue Feb 1, 2023 · 68 comments
Open
1 of 3 tasks

Tracking Issue for BTreeMap cursors #107540

Amanieu opened this issue Feb 1, 2023 · 68 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Amanieu
Copy link
Member

Amanieu commented Feb 1, 2023

Feature gate: #![feature(btree_cursors)]

ACP: rust-lang/libs-team#141

This is a tracking issue for the Cursor and CursorMut types for BTreeMap.

A Cursor is like an iterator, except that it can freely seek back-and-forth, and can safely mutate the tree during iteration. This is because the lifetime of its yielded references is tied to its own lifetime, instead of just the underlying tree. A cursor either points to an element in the tree, or to a "ghost" non-element that is logically located after the last element and before the first element.

Public API

impl<K, V> BTreeMap<K, V> {
    fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
}

struct Cursor<'a, K: 'a, V: 'a>;

impl<'a, K, V> Cursor<'a, K, V> {
    fn move_next(&mut self);
    fn move_prev(&mut self);

    fn key(&self) -> Option<&'a K>;
    fn value(&self) -> Option<&'a V>;
    fn key_value(&self) -> Option<(&'a K, &'a V)>;

    fn peek_next(&self) -> Option<(&'a K, &'a V)>;
    fn peek_prev(&self) -> Option<(&'a K, &'a V)>;
}

struct CursorMut<'a, K: 'a, V: 'a>;

impl<'a, K, V> CursorMut<'a, K, V> {
    fn move_next(&mut self);
    fn move_prev(&mut self);

    fn key(&self) -> Option<&K>;
    fn value(&self) -> Option<&V>;
    fn value_mut(&mut self) -> Option<&mut V>;
    fn key_value(&self) -> Option<(&K, &V)>;
    fn key_value_mut(&self) -> Option<(&K, &mut V)>;

    unsafe fn key_mut_unchecked(&mut self) -> Option<&mut K>;

    fn peek_next(&self) -> Option<(&K, &V)>;
    fn peek_prev(&self) -> Option<(&K, &V)>;

    fn as_cursor(&self) -> Cursor<'_, K, V>;

    fn insert_after(&mut self, key: K, value: V);
    fn insert_before(&mut self, key: K, value: V);

    unsafe fn insert_after_unchecked(&mut self, key: K, value: V);
    unsafe fn insert_before_unchecked(&mut self, key: K, value: V);

    fn remove_current(&mut self) -> Option<(K, V)>;
    fn remove_current_and_move_back(&mut self) -> Option<(K, V)>;
}

Steps / History

Unresolved Questions

  • None yet.

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@Amanieu Amanieu added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 1, 2023
@workingjubilee workingjubilee added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 5, 2023
@Bwallker
Copy link

Bwallker commented May 7, 2023

The key method on the Cursor and CursorMut types should be unsafe. If the key is interiorly mutable you could mutate it through the shared reference and break the invariants of BtreeMap. For example if the key is of type Cell<u32> I could call key.set and change the value from 1 to 1 billion and break potentially both the "Key must be unique" and "Key must be correctly ordered" invariants of BTreeMap.

@parasyte
Copy link

parasyte commented May 7, 2023

As I understand it, the BTreeMap invariants are not safety invariants. If the key is changed from underneath the map, it can lead to bugs. But crucially not to memory safety violations.

The standard library docs make this distinction clear:

It is a logic error for a key to be modified in such a way that the key’s ordering relative to any other key, as determined by the Ord trait, changes while it is in the map. This is normally only possible through Cell, RefCell, global state, I/O, or unsafe code. The behavior resulting from such a logic error is not specified, but will be encapsulated to the BTreeMap that observed the logic error and not result in undefined behavior. This could include panics, incorrect results, aborts, memory leaks, and non-termination.

@Bwallker
Copy link

Bwallker commented May 7, 2023

But the key_mut_unchecked method on CursorMut is unsafe because you might break the invariants of BTreeMap

@parasyte
Copy link

parasyte commented May 7, 2023

Unless there is a memory safety invariant with &mut K, I don't think it needs to be unsafe. A counterexample is BTreeMap::get_key_value(), which is safe to call, even if the key has interior mutability.

@Amanieu
Copy link
Member Author

Amanieu commented May 7, 2023

The invariant here is that if a function receives a BTreeMap<K, V> where K is a type with a "trusted" Ord implementation (such as usize) then it is guaranteed that the items in the BTreeMap will be in correctly sorted order.

@Bwallker
Copy link

Bwallker commented May 7, 2023

You should clarify this in the docs. It's not really clear why key_mut_unchecked is unsafe reading the docs IMO

@parasyte
Copy link

parasyte commented May 7, 2023

FWIW, I agree that the docs are unclear. Copied directly from rust-lang/libs-team#141:

    /// Returns a mutable reference to the of the element that the cursor is
    /// currently pointing to.
    ///
    /// This can be used to modify the key, but you must ensure that the
    /// `BTreeMap` invariants are maintained. Specifically:
    ///
    /// * The key must remain unique within the tree.
    /// * The key must remain in sorted order with regards to other elements in
    ///   the tree.
    unsafe fn key_mut_unchecked(&mut self) -> Option<&mut K>;

This does not explain what exactly is "unchecked". It appears to just repeat the invariants listed in the BTreeMap docs, which claim there is no possibility of UB by breaking them.

@clarfonthey
Copy link
Contributor

So, I've been trying to create my own "range map" using this API and it feels… extremely clunky to me. Since I'm mostly working with the mutable API, I'll be talking about CursorMut specifically, but I suppose that some of these can apply to Cursor as well:

The handling of "ghost elements" is extremely clunky to me. From what I understand, these only can show up at the beginning and end of the map, since all entries into the API are via lower_bound and upper_bound which by their nature will "snap" to an existing, non-element. Personally, what would make the most sense to me is to require the cursor to always point to a valid element and return an entry-like enum for the lower_bound and upper_bound methods. This way, you can still distinguish between whether you're at the edges of the map on the initial creation of the cursor while not having to handle this case for all other uses of the cursor.

In that line of reasoning, I think that move_next and move_prev would be best if they simply returned a boolean (or other sentinel value) stating whether the move was successful, simply staying at the last or first element in the map rather than potentially moving into a ghost state.

While I think that the guarantees for the key_mut_unchecked method are poorly documented as others have mentioned, I appreciate the addition of it and personally use it myself. The only issue is having to handle the Option every time like mentioned.

@Amanieu
Copy link
Member Author

Amanieu commented Jun 9, 2023

There are several advantages in having a ghost element:

  • Insertion at a certain position "just works" after a lower_bound/upper_bound lookup, even if the tree is empty.
  • After removing the last element, the cursor must still point to something.
  • move_next/move_prev returning a bool is somewhat redundant given that .get() already returns an option.

With that said, I can see how it can be hard to work with. I'd be happy to try out a better API if you want to sketch one out.

@jeffparsons
Copy link
Contributor

jeffparsons commented Jun 10, 2023

I assume this is a non-starter for performance reasons, but the most "natural" feeling interface for me would be one that can point to all real elements and all points between/next to them (including start/end). There would then be convenience methods for skipping over whatever you're not interested in so you can stay in "element space" or "gap space" if you only care about one or the other most of the time.

But again, I imagine this would make the API too slow because the compiler couldn't optimise away visiting all the gaps — does that sound right?

(Edit: if this doesn't make sense, I'd be happy to sketch what I imagine would be the core API, and what would be the optional convenience layers.)

(Edit2: or you could have separate cursor types for when you're pointing at an element vs a gap, and have the navigation functions consume self and convert between them where relevant. I think that would be on par for performance with the current design?)

@clarfonthey
Copy link
Contributor

Honestly, the kind of modification I was thinking of was to leverage the existing Entry API for mutable cursors, and to simply take ownership of the cursor during certain operations that might cause the entry to become vacant, returning an entry instead of just modifying the occupied entry.

@lf-
Copy link
Contributor

lf- commented Jun 11, 2023

I have been implementing a TCP segment reordering buffer (where you're storing ranges and they wrap around u32::MAX).

So I have been looking into building a wrapping iterator over a BTreeMap and it seems that implementing a wrapping mutating cursor based on this is surprisingly nontrivial: you'd have to ... somehow ... swap between a &mut BTreeMap (to create a new CursorMut when you wrap) and a CursorMut and when you get rid of the CursorMut to replace it, put the &mut BTreeMap back. This is kind of nonobvious how to do when hand rolling a generator as one does while writing these things.

I guess my feedback on this API is that it might need a function to move it back to the start or to a specified element.

@finnbear
Copy link
Contributor

finnbear commented Jun 12, 2023

Can we document the potential hazard that BTreeMap::upper_bound does the opposite of C++'s map::upper_bound?

In other words, when porting C++ to Rust,

  • map::lower_bound -> BTreeMap::lower_bound(Bound::Included(...))
  • map::upper_bound -> BTreeMap::lower_bound(Bound::Excluded(...))

FWIW, I'm strongly in favor of both API's 🚀

@Amanieu
Copy link
Member Author

Amanieu commented Jun 12, 2023

Can we document the potential hazard that BTreeMap::upper_bound does the opposite of C++'s map::upper_bound?

Are you sure about this? They should map the same way.

upper_bound returns the last element below the given bound, which should match what C++'s upper_bound does.

@finnbear
Copy link
Contributor

finnbear commented Jun 12, 2023

Are you sure about this? They should map the same way.

I'm more confused than I am sure, but I still get the impression that there is a hazard when porting C++ to Rust.

C++ upper_bound docs say that it "Returns an iterator pointing to the first element in the container whose key is considered to go after k."

Rust BTreeMap::upper_bound docs say that it "Returns a Cursor pointing at the last element that is below the given bound."

So, here are the results on an example map:

k1 -> v1
k2 -> v2
k3 -> v3
k4 -> v4
k5 -> v5

// C++
.upper_bound("k3") = "v4" // notably the same as .lower_bound(Bound::Excluded("k3")) in Rust
.lower_bound("k3") = "v3"

// Rust
.upper_bound(Bound::Excluded("k3")) = "v2"
.upper_bound(Bound::Included("k3")) = "v3"
.lower_bound(Bound::Excluded("k3")) = "v4"
.lower_bound(Bound::Included("k3")) = "v3"
My code

Online C++

#include 
#include 

int main ()
{
  std::map mymap;
  std::map::iterator itlow,itup;

  mymap[1]=1;
  mymap[2]=2;
  mymap[3]=3;
  mymap[4]=4;
  mymap[5]=5;
  
  itup=mymap.upper_bound (3);
  std::cout << itup->first << ' ' << itup->second << '\n';
  
  itlow=mymap.lower_bound (3);
  std::cout << itlow->first << ' ' << itlow->second << '\n';
  

  return 0;
}

Rust Playground

#![feature(btree_cursors)]

fn main() {
    use std::collections::BTreeMap;
    use std::ops::Bound;
    
    let mut a = BTreeMap::new();
    a.insert("k1", "v1");
    a.insert("k2", "v2");
    a.insert("k3", "v3");
    a.insert("k4", "v4");
    a.insert("k5", "v5");
    let cursor = a.upper_bound(Bound::Excluded(&"k3"));
    println!("upper exc {:?}", cursor.key());
     let cursor = a.upper_bound(Bound::Included(&"k3"));
    println!("upper inc {:?}", cursor.key());
    let cursor = a.lower_bound(Bound::Excluded(&"k3"));
    println!("lower exc {:?}", cursor.key());
     let cursor = a.lower_bound(Bound::Included(&"k3"));
    println!("lower inc {:?}", cursor.key());
}

@Amanieu thoughts?

@Amanieu
Copy link
Member Author

Amanieu commented Jun 13, 2023

I see, you are indeed correct.


Considering the feedback received so far, I am wondering if changing Cursors to represent a point between 2 elements rather than pointing at one element would be better:

  • This avoids the need for a ghost element, since even in an empty tree there exists a point which has no previous or next element.
  • lower_bound/upper_bound naturally end up pointing between 2 elements, which is less confusing.
  • Moving over a value with move_prev/move_next can return an Option<(&K, &mut V)> which results in a more natural Rust API.

Let me know if this sounds promising, and I'll try and sketch out a new API with this design.

@finnbear
Copy link
Contributor

finnbear commented Jun 13, 2023

Let me know if this sounds promising

It's hard to say which is better without some short before & after code examples i.e. to do X, you do Y with the old API and Z with the new API (if you wanted, the examples could also show how to accomplish the same thing in C++ or other languages). For now, it sounds plausible but not necessarily better or worse.

Anyway, after realizing the confusion, the API's were both intuitive and useful for the porting I was doing today and the only thing I'm asking for is for the docs to point out the porting hazard (alternatively, the function names could be changed to avoid matching those used in C++).

@Amanieu
Copy link
Member Author

Amanieu commented Jun 13, 2023

Here's an API sketch for a cursor that resides at a point between 2 elements. Elements are read by "passing over" them with a cursor.

impl<K, V> BTreeMap<K, V> {
    fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
    fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
}

struct Cursor<'a, K: 'a, V: 'a>;

impl<'a, K, V> Cursor<'a, K, V> {
    fn next(&mut self) -> Option<(&'a K, &'a V)>;
    fn prev(&mut self) -> Option<(&'a K, &'a V)>;

    fn peek_next(&self) -> Option<(&'a K, &'a V)>;
    fn peek_prev(&self) -> Option<(&'a K, &'a V)>;
}

struct CursorMut<'a, K: 'a, V: 'a>;

impl<'a, K, V> CursorMut<'a, K, V> {
    fn next(&mut self) -> Option<(&K, &mut V)>;
    fn prev(&mut self) -> Option<(&K, &mut V)>;

    unsafe fn next_with_mut_key(&mut self) -> Option<(&mut K, &mut V)>;
    unsafe fn prev_with_mut_key(&mut self) -> Option<(&mut K, &mut V)>;

    fn peek_next(&self) -> Option<(&K, &V)>;
    fn peek_prev(&self) -> Option<(&K, &V)>;

    fn peek_next_mut(&mut self) -> Option<(&K, &mut V)>;
    fn peek_prev_mut(&mut self) -> Option<(&K, &mut V)>;

    unsafe fn peek_next_with_mut_key(&mut self) -> Option<(&mut K, &mut V)>;
    unsafe fn peek_prev_with_mut_key(&mut self) -> Option<(&mut K, &mut V)>;

    fn as_cursor(&self) -> Cursor<'_, K, V>;

    // Inserts at current point, cursor is moved to before the inserted element.
    fn insert_after(&mut self, key: K, value: V);
    // Inserts at current point, cursor is moved to after the inserted element.
    fn insert_before(&mut self, key: K, value: V);

    unsafe fn insert_after_unchecked(&mut self, key: K, value: V);
    unsafe fn insert_before_unchecked(&mut self, key: K, value: V);

    fn remove_next(&mut self) -> Option<(K, V)>;
    fn remove_prev(&mut self) -> Option<(K, V)>;
}

@clarfonthey
Copy link
Contributor

clarfonthey commented Jun 14, 2023

Conceptually, I think that I do like this API more, but I still think that the combinatorical explosion of options here could be more easily avoided by integrating properly with the Entry API. Namely, if instead of having the options for the various combinations of whether you're inserting, mutating the key, etc. you could just leverage the existing Entry API with the ability to do key mutation.

Essentially, the position of the cursor between the elements is still unchanged. But rather than making the cursor this all-powerful entity, we give it equal footing with other various elements.

To accomplish this, I propose a particular sleight of hand. We retain the meaning of a "gap" as the position in the map between elements. A vacant entry points to a particular gap, and an occupied entry points to a particular element and its adjacent gaps. When mutating a key for either kind of entry, you must uphold that the entry upholds the Ord invariant for the map. So, you can nudge it around within its entry fine, but you can't leave it.

By leaving the semantics of the key specifically in the entry and requiring the entry for mutation of elements, the semantics of all the various mutable methods becomes more clear. Here's what the conceptual API would look like:

type Position = /* we'll talk about this later */;

impl<'a, K, V> Cursor<'a, K, V> { // and `CursorMut`
    // looks around the cursor
    fn peek_next(&self) -> Option<(&K, &V)>;
    fn peek_prev(&self) -> Option<(&K, &V)>;

    // move the cursor
    fn next(&mut self) -> Position;
    fn prev(&mut self) -> Position;
}
    
impl<'a, K, V> CursorMut<'a, K, V> {
    // attempts to insert at this spot in the cursor
    fn insert(self, key: K) -> Result<VacantEntry<'a, K, V>, CursorMut<'a, K, V>>;
    unsafe fn insert_unchecked(self, key: K) -> VacantEntry<'a, K, V>;

    // mutates around the cursor
    fn enter_next(&mut self) -> Option<OccupiedEntry<'_, K, V>>;
    fn enter_prev(&mut self) -> Option<OccupiedEntry<'_, K, V>>;
}

impl<'a, K, V> OccupiedEntry<'a, K, V> { // and `VacantEntry` and `Entry` too
    // requires upholding invariants mentioned
    unsafe fn key_mut(&mut self) -> &mut K;
}

impl<'a, K, V> VacantEntry<'a, K, V> {
    // if we want the ability to convert back into a cursor
    fn seek_elsewhere(self) -> (K, CursorMut<'a, K, V>);
    fn insert_and_seek_after(self, value: V) -> CursorMut<'a, K, V>;
    fn insert_and_seek_before(self, value: V) -> CursorMut<'a, K, V>;
}

Part of the reason why I think VacantEntry should have mutable access to the key is because it aligns well with the idea of entries being positions in the map associated with owned keys, which may or may not be inserted.

Note the bit I said about Position. I don't think that it's right to return a mutable value when mutating the cursor, mostly because we'd want all mutation to go through entries instead. I personally think it makes the most sense to return Option<&K> just because inspecting the key is really what you care most about, although since we're going to be providing the key anyway, we might as well also provide the value.

That's the bit I'm fuzziest on. But the rest is my basic idea on how to use entries to solve the complexity and the semantics. We don't need any weird insert_after or insert_before because we require converting a cursor into an entry, not simply borrowing it. I list some potential methods we could add to VacantEntry to allow converting back, but those weren't super thought out. My thought process is that the last step someone will probably want to take for the entry API is inserting, since you need to poke around to ensure you're satisfying invariants before you actually do the insert anyway.

@clarfonthey
Copy link
Contributor

A side note on the naming of upper_bound and lower_bound conflicting with C++: why not just abandon that scheme altogether, then? We could go with something like seek_before and seek_after which would be (IMHO) unambiguously clear and is different enough from the status quo to avoid passive mistakes.

@clarfonthey
Copy link
Contributor

While I'm not familiar enough with the cursor code to refactor everything to only point between elements, I did submit #112896 which adds the key_mut API to the entry types. One point I bring up in the PR which I think is worth mentioning here is that this kind of feature might deserve separation from this actual cursor API into its own tracking issue, since IMHO the cursor API and key mutation are concerns that should be separated. Would love to know what the folks following this thread think about this, and also would appreciate any additional feedback on the API I've proposed which integrates more with the entry API.

@SvizelPritula
Copy link

I'd like to also comment on the method names, if I can.

I've participated in several programming competitions in C++, where binary searches and searches on binary trees are common. Every time I've had to perform one, I had to stare at the documentation for several minutes, struggling to figure out what lower_bound and upper_bound does. (Granted, this is partially a fault of the docs.) I always wished C++ had gone with four seperate methods: find_last_less, find_first_greater_or_equal, find_last_less_or_equal and find_first_greater.

Rust's current upper_bound and lower_bound methods are significantly easier to understand and remember, but I also feel it would be easier to understand if they were called seek_last (or maybe find_last?) and seek_first.

@momvart
Copy link
Contributor

momvart commented Feb 10, 2024

Chalk me up as one of the "weirdos" that thought the old API made sense, lol.

I have the same problem. I was also confused by how bounds are interpreted. I think the term "above" is misleading. Finally, after quite a bit of time I came up with the following rules.

let mut a = BTreeMap::new();
a.insert(1, "a");
a.insert(2, "b");
a.insert(3, "c");
a.insert(4, "d");

// a: [1, 2, 3, 4]
// The range of Included(&2) to Unbounded:      [.. , 2 , ..]
//              [2, +∞)                             ---------
let cursor = a.lower_bound(Bound::Included(&2));
// The actual cursor:                           [1 , 2 , 3 , ..]
//                                                 ^
assert_eq!(cursor.peek_prev(), Some((&1, &"a")));
assert_eq!(cursor.peek_next(), Some((&2, &"b")));
// The range of Excluded(&2) to Unbounded:      [.. , 2 , ..]
//              (2, +∞)                                 -----
let cursor = a.lower_bound(Bound::Excluded(&2));
// The actual cursor:                           [1 , 2 , 3 , ..]
//                                                     ^
assert_eq!(cursor.peek_prev(), Some((&2, &"b")));
assert_eq!(cursor.peek_next(), Some((&3, &"c")));
// Conclusion: for lower_bound, the next node of the cursor is within the range.

// The range of Unbounded to Included(&3):      [.. , 3 , ..]
//              (-∞, 3]                         ---------
let cursor = a.upper_bound(Bound::Included(&3));
// The actual cursor:                           [.. , 2 , 3 , 4]
//                                                          ^
assert_eq!(cursor.peek_prev(), Some((&3, &"c")));
assert_eq!(cursor.peek_next(), Some((&4, &"d")));
// The range of Unbounded to Excluded(&3):      [.. , 3 , ..]
//              (-∞, 3)                         -----
let cursor = a.upper_bound(Bound::Excluded(&3));
// The actual cursor:                           [.. , 2 , 3 , 4]
//                                                      ^
assert_eq!(cursor.peek_prev(), Some((&2, &"b")));
assert_eq!(cursor.peek_next(), Some((&3, &"c")));
// Conclusion: for upper_bound, the previous node of the cursor is within the range.

@Amanieu
Copy link
Member Author

Amanieu commented Feb 11, 2024

Feel free to submit a PR if you think you can make the documentation better. We can further refine the wording in that PR.

@ripytide
Copy link
Contributor

ripytide commented Feb 11, 2024

I had a go at improving some of the documentation in #120936

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 12, 2024
improve `btree_cursors` functions documentation

As suggested by `@Amanieu` (and others) in rust-lang#107540 (rust-lang#107540 (comment))

Improvements:
- Document exact behavior of `{upper/lower}_bound{,_mut}` with each of the three `Bound` types using unambigous words `{greatest,greater,smallest,smaller,before,after}`.
- Added another doc-example for the `Bound::Unbounded` for each of the methods
- Changed doc-example to use From<[T; N]> rather than lots of `insert()`s which requires a mutable map which clutters the example when `mut` may not be required for the method (such as for `{upper,lower}_bound`.
- Removed `# Panics` section from `insert_{before,after}` methods since they were changed to return an error instead a while ago.
- Reworded some phrases to be more consistent with the more regular `BTreeMap` methods such as calling entries "key-value" rather than "element"s.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 12, 2024
improve `btree_cursors` functions documentation

As suggested by ``@Amanieu`` (and others) in rust-lang#107540 (rust-lang#107540 (comment))

Improvements:
- Document exact behavior of `{upper/lower}_bound{,_mut}` with each of the three `Bound` types using unambigous words `{greatest,greater,smallest,smaller,before,after}`.
- Added another doc-example for the `Bound::Unbounded` for each of the methods
- Changed doc-example to use From<[T; N]> rather than lots of `insert()`s which requires a mutable map which clutters the example when `mut` may not be required for the method (such as for `{upper,lower}_bound`.
- Removed `# Panics` section from `insert_{before,after}` methods since they were changed to return an error instead a while ago.
- Reworded some phrases to be more consistent with the more regular `BTreeMap` methods such as calling entries "key-value" rather than "element"s.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 12, 2024
Rollup merge of rust-lang#120936 - ripytide:master, r=Amanieu

improve `btree_cursors` functions documentation

As suggested by ``@Amanieu`` (and others) in rust-lang#107540 (rust-lang#107540 (comment))

Improvements:
- Document exact behavior of `{upper/lower}_bound{,_mut}` with each of the three `Bound` types using unambigous words `{greatest,greater,smallest,smaller,before,after}`.
- Added another doc-example for the `Bound::Unbounded` for each of the methods
- Changed doc-example to use From<[T; N]> rather than lots of `insert()`s which requires a mutable map which clutters the example when `mut` may not be required for the method (such as for `{upper,lower}_bound`.
- Removed `# Panics` section from `insert_{before,after}` methods since they were changed to return an error instead a while ago.
- Reworded some phrases to be more consistent with the more regular `BTreeMap` methods such as calling entries "key-value" rather than "element"s.
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Apr 7, 2024
improve `btree_cursors` functions documentation

As suggested by ``@Amanieu`` (and others) in #107540 (rust-lang/rust#107540 (comment))

Improvements:
- Document exact behavior of `{upper/lower}_bound{,_mut}` with each of the three `Bound` types using unambigous words `{greatest,greater,smallest,smaller,before,after}`.
- Added another doc-example for the `Bound::Unbounded` for each of the methods
- Changed doc-example to use From<[T; N]> rather than lots of `insert()`s which requires a mutable map which clutters the example when `mut` may not be required for the method (such as for `{upper,lower}_bound`.
- Removed `# Panics` section from `insert_{before,after}` methods since they were changed to return an error instead a while ago.
- Reworded some phrases to be more consistent with the more regular `BTreeMap` methods such as calling entries "key-value" rather than "element"s.
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this issue Apr 27, 2024
improve `btree_cursors` functions documentation

As suggested by ``@Amanieu`` (and others) in #107540 (rust-lang/rust#107540 (comment))

Improvements:
- Document exact behavior of `{upper/lower}_bound{,_mut}` with each of the three `Bound` types using unambigous words `{greatest,greater,smallest,smaller,before,after}`.
- Added another doc-example for the `Bound::Unbounded` for each of the methods
- Changed doc-example to use From<[T; N]> rather than lots of `insert()`s which requires a mutable map which clutters the example when `mut` may not be required for the method (such as for `{upper,lower}_bound`.
- Removed `# Panics` section from `insert_{before,after}` methods since they were changed to return an error instead a while ago.
- Reworded some phrases to be more consistent with the more regular `BTreeMap` methods such as calling entries "key-value" rather than "element"s.
tgross35 added a commit to tgross35/rust that referenced this issue Jul 28, 2024
… r=tgross35

Add missing periods on `BTreeMap` cursor `peek_next` docs

Tracking issue: rust-lang#107540
@GrigorenkoPV
Copy link
Contributor

Sorry, I did not read the entire discussion. Could anyone, please, briefly summarize why the decision was made to enable key mutation via a separate CursorMutKey and not via unsafe methods on CursorMut?

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jul 29, 2024
… r=tgross35

Add missing periods on `BTreeMap` cursor `peek_next` docs

Tracking issue: rust-lang#107540
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jul 29, 2024
Rollup merge of rust-lang#128310 - kmicklas:btree-map-peek-next-docs, r=tgross35

Add missing periods on `BTreeMap` cursor `peek_next` docs

Tracking issue: rust-lang#107540
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 4, 2024
Implement cursors for `BTreeSet`

Tracking issue: rust-lang#107540

This is a straightforward wrapping of the map API, except that map's `CursorMut` does not make sense, because there is no value to mutate. Hence, map's `CursorMutKey` is wrapped here as just `CursorMut`, since it's unambiguous for sets and we don't normally speak of "keys". On the other hand, I can see some potential for confusion with `CursorMut` meaning different things in each module. I'm happy to take suggestions to improve that.

r? `@Amanieu`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 4, 2024
Implement cursors for `BTreeSet`

Tracking issue: rust-lang#107540

This is a straightforward wrapping of the map API, except that map's `CursorMut` does not make sense, because there is no value to mutate. Hence, map's `CursorMutKey` is wrapped here as just `CursorMut`, since it's unambiguous for sets and we don't normally speak of "keys". On the other hand, I can see some potential for confusion with `CursorMut` meaning different things in each module. I'm happy to take suggestions to improve that.

r? ``@Amanieu``
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 5, 2024
Implement cursors for `BTreeSet`

Tracking issue: rust-lang#107540

This is a straightforward wrapping of the map API, except that map's `CursorMut` does not make sense, because there is no value to mutate. Hence, map's `CursorMutKey` is wrapped here as just `CursorMut`, since it's unambiguous for sets and we don't normally speak of "keys". On the other hand, I can see some potential for confusion with `CursorMut` meaning different things in each module. I'm happy to take suggestions to improve that.

r? ```@Amanieu```
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 5, 2024
Implement cursors for `BTreeSet`

Tracking issue: rust-lang#107540

This is a straightforward wrapping of the map API, except that map's `CursorMut` does not make sense, because there is no value to mutate. Hence, map's `CursorMutKey` is wrapped here as just `CursorMut`, since it's unambiguous for sets and we don't normally speak of "keys". On the other hand, I can see some potential for confusion with `CursorMut` meaning different things in each module. I'm happy to take suggestions to improve that.

r? ````@Amanieu````
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Aug 5, 2024
Rollup merge of rust-lang#128309 - kmicklas:btreeset-cursor, r=Amanieu

Implement cursors for `BTreeSet`

Tracking issue: rust-lang#107540

This is a straightforward wrapping of the map API, except that map's `CursorMut` does not make sense, because there is no value to mutate. Hence, map's `CursorMutKey` is wrapped here as just `CursorMut`, since it's unambiguous for sets and we don't normally speak of "keys". On the other hand, I can see some potential for confusion with `CursorMut` meaning different things in each module. I'm happy to take suggestions to improve that.

r? ````@Amanieu````
@terrorfisch
Copy link

@GrigorenkoPV according to the 8ee9693 commit message:

The ability to mutate keys is also factored out into a separate
CursorMutKey type which is unsafe to create. This makes the API easier
to use since it avoids duplicated versions of each method with and
without key mutation.

Maybe there is more discussion somewhere?

I think one should explore a Key<'a, K>(&'a mut K) type which implements Deref<Target=K> and has a unsafe fn get_mut(&mut self) -> &mut K method.

@AngheloAlf
Copy link

AngheloAlf commented Nov 22, 2024

Could it be possible to add functions to CursorMut that return references that depend on the lifetime of the original BTreeMap?

For example:

impl<'a, K, V> CursorMut<'a, K, V> {
    fn peek_next(&mut self) -> Option<(&K, &mut V)>;
    fn peek_prev(&mut self) -> Option<(&K, &mut V)>;

    // TODO: a less silly name
    fn peek_next_lifetime(&mut self) -> Option<(&'a K, &'a mut V)>;
    fn peek_prev_lifetime(&mut self) -> Option<(&'a K, &'a mut V)>;
}

My use case is to have an interface that allows to add an object to the TreeMap if the key is vacant or to leave it unmodified if it already exists and then return a mutable reference to the object. This is almost doable with the current CursorMut interface, but I ended up needing an unsafe block at the bottom to accomplish this because the lifetime of peek_prev depends on the local cursor object instead of the TreeMap.

As a reference, this is a simplified version of my code:

pub fn add_symbol<'a>(
    map: &'a mut BTreeMap<u32, SymbolMetadata>,
    vram: u32,
) -> &'a mut SymbolMetadata {
    let mut cursor = map.upper_bound_mut(Bound::Included(&vram));

    let must_insert_new = if let Some((sym_vram, sym)) = cursor.peek_prev() {
        if vram == *sym_vram {
            // Symbol already exists
            false
        } else {
            // Create new symbol if new `vram` is outside the range of the last symbol that's lower than `vram`
            vram >= *sym_vram + sym.size()
        }
    } else {
        // Create new if it doesn't exist
        true
    };

    if must_insert_new {
        cursor
            .insert_before(vram, SymbolMetadata::new(vram))
            .expect("This should not be able to panic");
    }

    // Hack to workaround lifetimes issues
    let ptr = cursor.peek_prev().unwrap().1 as *mut SymbolMetadata;
    unsafe { &mut *ptr }
}

@hanna-kruppe
Copy link
Contributor

The proposed signature (with &mut self) allows calling these functions several times while holding on to the result. So you could get multiple aliasing &'a mut V, which makes these functions very unsound. I think taking self by value would fix that problem (and seems sufficient for your use case) but I haven’t thought about it very deeply. With that signature you can’t call both the prev and next variants on the same cursor, which may be annoying for other use cases, so perhaps it should be something like:

fn into_prev_and_next(self) -> (Option<…>, Option<…>)

@AngheloAlf
Copy link

Yeah, you are right, I didn't realize Rust would allow multiple mut references to 'a to live at the same time.

Your into_prev_and_next proposal sounds good. Just to be sure, it would look like this, right?

impl<'a, K, V> CursorMut<'a, K, V> {
    fn into_prev_and_next(self) -> (Option<(&'a K, &'a mut V)>, Option<(&'a K, &'a mut V)>);
}

This is an implementation I made to test the idea, and seems to work fine.

fn into_prev_and_next<'a, K, V>(mut cursor: btree_map::CursorMut<'a, K, V>) -> (Option<(&'a K, &'a mut V)>, Option<(&'a K, &'a mut V)>) {
    let prev: Option<(&'a K, &'a mut V)> = cursor.peek_prev().map(|(k, v)| {
        let ptr_k = k as *const K;
        let ptr_v = v as *mut V;
        unsafe {
            (&*ptr_k, &mut *ptr_v)
        }
    });
    let next: Option<(&'a K, &'a mut V)> = cursor.peek_next().map(|(k, v)| {
        let ptr_k = k as *const K;
        let ptr_v = v as *mut V;
        unsafe {
            (&*ptr_k, &mut *ptr_v)
        }
    });

    (prev, next)
}

I suppose CursorMutKey will want a similar interface too.

@hanna-kruppe
Copy link
Contributor

Yes, that’s the signature I meant. The implementation sketch makes me nervous because it looks like the sort of code that can run afoul of aliasing models (stacked borrows, tree borrows, or future alternatives - try testing it under Miri). But I’m sure that’s just an artifact of implementing it in terms of the existing methods, std can definitely implement it without such problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests