-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the BTreeMap cursor API using gaps #118208
Conversation
Tracking issue: rust-lang#107540 Currently, a `Cursor` points to a single element in the tree, and allows moving to the next or previous element while mutating the tree. However this was found to be confusing and hard to use. This PR completely refactors cursors to instead point to a gap between two elements in the tree. This eliminates the need for a "ghost" element that exists after the last element and before the first one. Additionally, `upper_bound` and `lower_bound` now have a much clearer meaning. The ability to mutate keys is also factored out into a separate `CursorMutKey` type which is unsafe to create. This makes the API easier to use since it avoids duplicated versions of each method with and without key mutation. API summary: ```rust impl<K, V> BTreeMap<K, V> { fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; } struct Cursor<'a, K: 'a, V: 'a>; impl<'a, K, V> Cursor<'a, K, V> { fn next(&mut self) -> Option<(&'a K, &'a V)>; fn prev(&mut self) -> Option<(&'a K, &'a V)>; fn peek_next(&self) -> Option<(&'a K, &'a V)>; fn peek_prev(&self) -> Option<(&'a K, &'a V)>; } struct CursorMut<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&K, &mut V)>; fn prev(&mut self) -> Option<(&K, &mut V)>; fn peek_next(&mut self) -> Option<(&K, &mut V)>; fn peek_prev(&mut self) -> Option<(&K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V); fn insert_before(&mut self, key: K, value: V); fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } struct CursorMutKey<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&mut K, &mut V)>; fn prev(&mut self) -> Option<(&mut K, &mut V)>; fn peek_next(&mut self) -> Option<(&mut K, &mut V)>; fn peek_prev(&mut self) -> Option<(&mut K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V); fn insert_before(&mut self, key: K, value: V); fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } ```
(rustbot has picked a reviewer for you, use r? to override) |
Co-authored-by: Joe ST <[email protected]>
should |
/// # Safety | ||
/// | ||
/// Since this cursor allows mutating keys, you must ensure that the `BTreeMap` | ||
/// invariants are maintained. Specifically: | ||
/// | ||
/// This returns `None` if the cursor is currently pointing to the | ||
/// "ghost" non-element. | ||
/// * The key of the newly inserted element must be unique in the tree. | ||
/// * All keys in the tree must remain in sorted order. | ||
#[unstable(feature = "btree_cursors", issue = "107540")] | ||
pub fn value_mut(&mut self) -> Option<&mut V> { | ||
self.current.as_mut().map(|current| current.kv_mut().1) | ||
pub unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A> { | ||
self.inner | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm skeptical that maintaining BTreeMap
invariants is safety critical because, if it was, then the following safe code would be unsound:
struct Key;
impl Ord for Key {
fn cmp(&self, other: &Self) -> Ordering {
[Ordering::Less, Ordering::Equal, Ordering::Greater].choose(&mut thread_rng())
}
}
for _ in 0..3 {
map.insert(Key, ());
}
And, from the docs, we have:
It is a logic error for a key to be modified in such a way that the key’s ordering relative to any other key, as determined by the Ord trait, changes while it is in the map...The behavior resulting from such a logic error is not specified, but will be encapsulated to the BTreeMap that observed the logic error and not result in undefined behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second, I don't see mutable
being spelled out in any ident in std
. It's always abbreviated to mut
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want that unsafe code can trust the order of BTreeMap<usize, _>
. If you can get a &mut reference to a key you can swap it with a different value even if the type (like usize) is entirely well-behaved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok good to know! My point about the name still stands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in the tracking issue, I much prefer leveraging the entry API and adding key mutation there (see: #112896) rather than adding an entire other API like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
@@ -2738,22 +2740,21 @@ impl<K, V> Clone for Cursor<'_, K, V> { | |||
#[unstable(feature = "btree_cursors", issue = "107540")] | |||
impl<K: Debug, V: Debug> Debug for Cursor<'_, K, V> { | |||
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | |||
f.debug_tuple("Cursor").field(&self.key_value()).finish() | |||
f.write_str("Cursor") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a requirement to merge, but it would be nice if this also detailed where the cursor was relative to the elements in the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe by including the value from either side of the cursor, if any?
Cursor {
prev: Some("m"),
next: Some("n"),
}
#[unstable(feature = "btree_cursors", issue = "107540")] | ||
impl<K: Debug, V: Debug, A> Debug for CursorMut<'_, K, V, A> { | ||
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
f.write_str("CursorMut") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here -- it could probably just reuse the same code.
@@ -3248,5 +3238,114 @@ impl<'a, K: Ord, V, A: Allocator + Clone> CursorMut<'a, K, V, A> { | |||
} | |||
} | |||
|
|||
impl<'a, K: Ord, V, A: Allocator + Clone> CursorMut<'a, K, V, A> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned on the tracking issue, I would much rather use the entry API with key mutation instead of coming up with an entirely different API mirrored on the cursor. Methods like these could potentially just return VacantEntry
or OccupiedEntry
to do insertion and mutation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Methods like these
I'm confused, which functions are you referring to?
I think it would be less confusing for both the entry API and the Cursor API to both have the core methods like mutating/changing entries rather than the Cursor API returning VacantEntry
/OccupiedEntry
which seems like it would make things quite messy as it adds an extra level of indirection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a downside to methods that consume self
and return a new type: it makes it much harder to write a type which wraps a CursorMut
. The methods on the wrapper type would take &mut self
but you can't easily take ownership of the cursor if it is in a struct field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, although we could always just mutably borrow in these cases; the only reason why entries take values by self is because they actually contain values that are consumed. Since the lifetime is still tracked in these cases, we don't have to worry about any aliasing issues.
Would it not be a better idea to rather than creating a special "unsafe" cursor to simply have unlimited |
The concept of this looks great to me. Rerolling on the basis of available review time: r? libs-api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this design based on gaps. Well done.
I have thoughts about some of the small details of the API that I will write up in the tracking issue and/or create my own followup PRs for consideration. Let's go ahead and land this PR though, as it is a huge step in the right direction.
@bors r+ |
Rewrite the BTreeMap cursor API using gaps Tracking issue: rust-lang#107540 Currently, a `Cursor` points to a single element in the tree, and allows moving to the next or previous element while mutating the tree. However this was found to be confusing and hard to use. This PR completely refactors cursors to instead point to a gap between two elements in the tree. This eliminates the need for a "ghost" element that exists after the last element and before the first one. Additionally, `upper_bound` and `lower_bound` now have a much clearer meaning. The ability to mutate keys is also factored out into a separate `CursorMutKey` type which is unsafe to create. This makes the API easier to use since it avoids duplicated versions of each method with and without key mutation. API summary: ```rust impl<K, V> BTreeMap<K, V> { fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; } struct Cursor<'a, K: 'a, V: 'a>; impl<'a, K, V> Cursor<'a, K, V> { fn next(&mut self) -> Option<(&'a K, &'a V)>; fn prev(&mut self) -> Option<(&'a K, &'a V)>; fn peek_next(&self) -> Option<(&'a K, &'a V)>; fn peek_prev(&self) -> Option<(&'a K, &'a V)>; } struct CursorMut<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&K, &mut V)>; fn prev(&mut self) -> Option<(&K, &mut V)>; fn peek_next(&mut self) -> Option<(&K, &mut V)>; fn peek_prev(&mut self) -> Option<(&K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn insert_before(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } struct CursorMutKey<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&mut K, &mut V)>; fn prev(&mut self) -> Option<(&mut K, &mut V)>; fn peek_next(&mut self) -> Option<(&mut K, &mut V)>; fn peek_prev(&mut self) -> Option<(&mut K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn insert_before(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } struct UnorderedKeyError; ```
…iaskrgr Rollup of 7 pull requests Successful merges: - rust-lang#118208 (Rewrite the BTreeMap cursor API using gaps) - rust-lang#120099 (linker: Refactor library linking methods in `trait Linker`) - rust-lang#120165 (Switch `NonZero` alias direction.) - rust-lang#120288 (Bump `askama` version) - rust-lang#120306 (Clean up after clone3 removal from pidfd code (docs and tests)) - rust-lang#120316 (Don't call `walk_` functions directly if there is an equivalent `visit_` method) - rust-lang#120330 (Remove coroutine info when building coroutine drop body) r? `@ghost` `@rustbot` modify labels: rollup
…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#118208 (Rewrite the BTreeMap cursor API using gaps) - rust-lang#120099 (linker: Refactor library linking methods in `trait Linker`) - rust-lang#120288 (Bump `askama` version) - rust-lang#120306 (Clean up after clone3 removal from pidfd code (docs and tests)) - rust-lang#120316 (Don't call `walk_` functions directly if there is an equivalent `visit_` method) - rust-lang#120330 (Remove coroutine info when building coroutine drop body) - rust-lang#120332 (Remove unused struct) - rust-lang#120338 (Fix links to [strict|exposed] provenance sections of `[std|core]::ptr`) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#118208 - Amanieu:btree_cursor2, r=dtolnay Rewrite the BTreeMap cursor API using gaps Tracking issue: rust-lang#107540 Currently, a `Cursor` points to a single element in the tree, and allows moving to the next or previous element while mutating the tree. However this was found to be confusing and hard to use. This PR completely refactors cursors to instead point to a gap between two elements in the tree. This eliminates the need for a "ghost" element that exists after the last element and before the first one. Additionally, `upper_bound` and `lower_bound` now have a much clearer meaning. The ability to mutate keys is also factored out into a separate `CursorMutKey` type which is unsafe to create. This makes the API easier to use since it avoids duplicated versions of each method with and without key mutation. API summary: ```rust impl<K, V> BTreeMap<K, V> { fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V> where K: Borrow<Q> + Ord, Q: Ord; } struct Cursor<'a, K: 'a, V: 'a>; impl<'a, K, V> Cursor<'a, K, V> { fn next(&mut self) -> Option<(&'a K, &'a V)>; fn prev(&mut self) -> Option<(&'a K, &'a V)>; fn peek_next(&self) -> Option<(&'a K, &'a V)>; fn peek_prev(&self) -> Option<(&'a K, &'a V)>; } struct CursorMut<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&K, &mut V)>; fn prev(&mut self) -> Option<(&K, &mut V)>; fn peek_next(&mut self) -> Option<(&K, &mut V)>; fn peek_prev(&mut self) -> Option<(&K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn insert_before(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } struct CursorMutKey<'a, K: 'a, V: 'a>; impl<'a, K, V> CursorMut<'a, K, V> { fn next(&mut self) -> Option<(&mut K, &mut V)>; fn prev(&mut self) -> Option<(&mut K, &mut V)>; fn peek_next(&mut self) -> Option<(&mut K, &mut V)>; fn peek_prev(&mut self) -> Option<(&mut K, &mut V)>; unsafe fn insert_after_unchecked(&mut self, key: K, value: V); unsafe fn insert_before_unchecked(&mut self, key: K, value: V); fn insert_after(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn insert_before(&mut self, key: K, value: V) -> Result<(), UnorderedKeyError>; fn remove_next(&mut self) -> Option<(K, V)>; fn remove_prev(&mut self) -> Option<(K, V)>; fn as_cursor(&self) -> Cursor<'_, K, V>; unsafe fn with_mutable_key(self) -> CursorMutKey<'a, K, V, A>; } struct UnorderedKeyError; ```
Tracking issue: #107540
Currently, a
Cursor
points to a single element in the tree, and allows moving to the next or previous element while mutating the tree. However this was found to be confusing and hard to use.This PR completely refactors cursors to instead point to a gap between two elements in the tree. This eliminates the need for a "ghost" element that exists after the last element and before the first one. Additionally,
upper_bound
andlower_bound
now have a much clearer meaning.The ability to mutate keys is also factored out into a separate
CursorMutKey
type which is unsafe to create. This makes the API easier to use since it avoids duplicated versions of each method with and without key mutation.API summary: