Skip to content

RFC: get collection keys #1175

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
121 changes: 121 additions & 0 deletions text/0000-get-collection-keys.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
- Feature Name: get_collection_keys
- Start Date: 2015-06-24
- RFC PR: https://github.com/rust-lang/rust/pull/26531
- Rust Issue:

# Summary

Get the original keys back from collections.

# Motivation

Removing a key from a HashMap or an item from a HashSet will drop the item.
This is usually acceptable for small clonable objects like numbers and strings,
but they are not always clonable, and even if they are it might be an expensive
operation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be massively fleshed out. I would argue that there are two major motivators:

  • Sets
  • Entries

Entries would primarily like to be able to reflect on the key to complete the construction of the value. Potentially also to decide whether to really do the insertion, though I don't expect this much or at all.

Sets on the other hand would like to be able to talk about the key at all. In particular you might use a Set as a cache by indexing on only a small fragment of the key (e.g. an integer id). For technical reasons it may be infeasible or just ineffecient to split the key into a proper (key, value) pair.

The rest of Maps are for the most part, I think, just a nice-to-have. Largely the extended Map API is just needed to implement the new Set API on top of (since Sets are generally trivial Map wrappers).


# Detailed design

There are currently four operations that could have a new function to them:
`get`, `get_mut`, `insert` and `remove`. All of those methods receive a reference
to a key type and return reference to the value or a the value itself.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally voice RFCs as "do this thing" and not "we could do this thing".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this sentence is troubled. Might I suggest "None of these methods provide access to the key" or something.


New analogous methods can be added to return not just the value but both the key
and the value. For `insert` and `remove` they can return the key and the value,
while `get` and `get_mut` will return a reference to them. Notice that `get_mut`
will return a mutable value but not a mutable key.

In the case of HashSet, the same concept applies for the items as if they were
keys in a hash map. It does not make sense to implement something like `get_mut`
in this case.

Structs that should implement these new methods are:

* std::collections::BTreeMap
* std::collections::HashMap
* std::collections::BTreeSet
* std::collections::HashSet


Some structs might implement these new methods, but their benefit besides
consistency is unclear since it would only apply for `usize`:

* std::collections::VecMap
* std::collections::BitSet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types are slated for deprecation. Regardless, this RFC is proposing a standard map/set API, and as such all maps/sets should implement them unless infeasible.


New API methods:

```rust
impl<K: Ord, V> BTreeMap<K, V> {
// ...

pub fn keyed_get<Q: ?Sized>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord;
pub fn keyed_get_mut<Q: ?Sized>(&mut self, key: &Q) -> Option<(&K, &mut V)> where K: Borrow<Q>, Q: Ord;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably you can yield &mut K for the hypothetical desire to mutate things in the key that don't affect hash/eq/ord. This of course allows you to mess up hash/eq/ord but the collection can't possibly care since that can already be true thanks to:

  • Interior mutability
  • Bad implementations of hash/eq/ord

The collection is already locked up yielding the &mut V, so it's not an ownership problem. In fact the &K is also mutably borrowing the map because of how lifetimes work. Obviously mutating a key in place is super sketchy, and I'm not clear that anyone can reasonably suggest they want this. Still... you could.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm in favor of allowing key mutation, but if we were to expose &mut K, how would that affect IterMut types? For backwards compatibility, we would be unable to yield (&mut K, &mut V) without introducing a new mode of iteration, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point. &mut K was honestly more of a musing; I ultimately probably wouldn't go for it, if only as a lint for "you shouldn't be mutating this".

pub fn keyed_insert(&mut self, mut key: K, mut value: V) -> Option<(K, V)>;
pub fn keyed_remove<Q: ?Sized>(&mut self, key: &Q) -> Option<(K, V)> where K: Borrow<Q>, Q: Ord;
}

impl<T: Ord> BTreeSet<T> {
// ...

pub fn get<Q: ?Sized>(&self, value: &Q) -> Option<&T> where T: Borrow<Q>, Q: Ord;
pub fn insert_item(&mut self, value: T) -> Option<T>;
pub fn remove_item<Q: ?Sized>(&mut self, value: &Q) -> Option<T> where T: Borrow<Q>, Q: Ord;
}

impl<K, V, S> HashMap<K, V, S>
where K: Eq + Hash, S: HashState
{
// ...

pub fn keyed_get<Q: ?Sized>(&self, k: &Q) -> Option<(&K, &V)>
where K: Borrow<Q>, Q: Hash + Eq;
pub fn get_mut<Q: ?Sized>(&mut self, k: &Q) -> Option<(&K, &mut V)>;
pub fn keyed_insert(&mut self, k: K, v: V) -> Option<(K, V)>;
pub fn keyed_remove<Q: ?Sized>(&mut self, k: &Q) -> Option<(K, V)>;
}

impl<T, S> HashSet<T, S>
where T: Eq + Hash, S: HashState {
// ...

pub fn get<Q: ?Sized>(&self, value: &Q) -> Option<&T>
where T: Borrow<Q>, Q: Hash + Eq;
pub fn remove_item<Q: ?Sized>(&mut self, value: &Q) -> Option<T>
where T: Borrow<Q>, Q: Hash + Eq;
pub fn insert_item(&mut self, value: T) -> Option<T>;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd nix the Hash* instances. Describe the generic API.

```

The existing `OccupiedEntry` could benefit from these new methods and provide
its own pretty API on top:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either propose this for certain or add it as an alternative or unresolved question.


```rust

impl<'a, K, V> OccupiedEntry<'a, K, V> {
fn key(&self) -> &K;
fn keyed_remove(self) -> (K, V);
fn keyed_get_mut(&mut self) -> (&K, &mut V);
fn keyed_get(&self) -> (&K, &V);
fn keyed_into_mut(self) -> (&'a K, &'a mut V);
}

```

# Drawbacks

Add more code that will be used only sporadically.

# Alternatives

Keep the current design, since keys are immutables once in the collection use
an Rc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't solve the Set usecase.


Add methods to only fetch the keys, but most of the users will need both or
can easily discard the value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange...


# Unresolved questions

Naming. `Entry` is already taken for a similar purpose. Having the same name
for sets and maps is desirable, but not required. `keyed` and `item` might be
a good fit.