Skip to content

implement the full set of sorts on QueryManyIter #13443

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 3, 2024

Conversation

Victoronz
Copy link
Contributor

@Victoronz Victoronz commented May 20, 2024

Objective

Blocked on #13417

Motivation is the same as in #13417. If users can sort QueryIter, to only makes sense to also allow them to use this functionality on QueryManyIter.

Solution

Also implement the sorts on QueryManyIter.

The implementation of the sorts themselves are mostly the same as with QueryIter in #13417.
They differ in that they re-use the entity_iter passed to the iter_many, and internally call iter_many_unchecked_manual on the lens QueryState with it.
These methods also return a different struct, QuerySortedManyIter, because there is no longer a guarantee of unique entities.
QuerySortedManyIter implements the various Iterator traits for read-only iteration, as QueryManyIter does + DoubleEndedIterator.
For mutable iteration, there is both a fetch_next and a fetch_next_back method. However, they only become available after the user calls collect_inner on QuerySortedManyIter first. This collects the inner entity_iter (this is the sorted one, not the original the user passed) to drop all query lens items to avoid aliasing.
When TAITs are available this collect_inner could be hidden away, until then it is unfortunately not possible to elide this without either regressing read-only iteration, or introducing a whole new type, mostly being a copy of QuerySortedIter.

As a follow-up we could add a entities_all_unique method to check whether the entity list consists of only unique entities, and then return a QuerySortedIter from it (under opaque impl Trait if need be), allowing mutable Iterator trait iteration over what was originally an iter_many call.
Such a method can also be added to QueryManyIter, albeit needing a separate, new return type.

Testing

I've switched the third example/doc test under sort out for one that shows the collect_inner/fetch_next_back functionality, otherwise the examples are the same as in #13417, adjusted to use iter_many instead of iter.

The query-iter-many-sorts test checks for equivalence to the underlying sorts.
The test after shows that these sorts do not panic after fetch/fetch_next calls.

Changelog

Added sort, sort_unstable, sort_by, sort_unstable_by, sort_by_key, sort_by_cached_key to QueryManyIter.
Added QuerySortedManyIter.

@Victoronz
Copy link
Contributor Author

Victoronz commented May 20, 2024

This currently doesn't include the #13417 commit (maybe it should?), aside from the NeutralOrd helper, they are independent.

@Victoronz Victoronz force-pushed the query-iter-many-sort branch from 274e38d to e7cdf21 Compare May 20, 2024 18:48
@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events S-Blocked This cannot move forward until something else changes C-Usability A targeted quality-of-life change that makes Bevy easier to use labels May 21, 2024
@Victoronz Victoronz force-pushed the query-iter-many-sort branch 4 times, most recently from 0db4abb to 64d8c7e Compare May 21, 2024 21:33
@Victoronz Victoronz force-pushed the query-iter-many-sort branch from 64d8c7e to d653f87 Compare May 21, 2024 21:44
@Victoronz Victoronz marked this pull request as ready for review May 21, 2024 21:55
@alice-i-cecile alice-i-cecile added S-Needs-Review Needs reviewer attention (from anyone!) to move forward and removed S-Blocked This cannot move forward until something else changes labels May 22, 2024
Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should exist, the docs are excellent, the engineering strategy is solid and well-justified and there's a good set of tests. Nice work.

@alice-i-cecile alice-i-cecile added X-Uncontroversial This work is generally agreed upon D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes D-Unsafe Touches with unsafe code in some way labels May 22, 2024
@Victoronz
Copy link
Contributor Author

Victoronz commented Jun 28, 2024

made the same addition as #14040

Copy link
Contributor

@chescock chescock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! I wrote some musings on possible ways to simplify things, but none of them should block this PR.

impl<'w, 's, D: QueryData, F: QueryFilter> QuerySortedManyIter<'w, 's, D, F, IntoIter<Entity>> {
/// Get next result from the query
/// [`collect_inner`](QuerySortedManyIter) needs to be called before this method becomes available.
/// This is done to prevent mutable aliasing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure I understand this correctly: The aliasing issue here is that D::Item may take mutable access to an entity while the L::Item from the lens still has shared access through another copy of the same Entity. And the only reason the L::Item is still alive is that the iter::Map is holding a vec::IntoIter<(Entity, L::Item)> with the items it hasn't yielded yet. And all it's going to do with the L::Item is drop it.

That's very subtle! I feel like it could use a comment somewhere explaining it, but I don't know what to write or where to put it.

Hmm... would it be possible to avoid the need for collect_inner() if you somehow drop the L::Item earlier? Like, suppose you use a Vec<(Entity, ManuallyDrop<L::Item>)> and manually drop them before the map() call. Would that be sound even without collect_inner()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me rephrase for clarity:
In an iteration with no duplicates (QuerySortedIter) this presents no issue because the offending reference is always dropped before D::Item is retrieved.
However with duplicates, a mutable reference to D::Item could alias with a not yet dropped L::Item later on in the iteration, which means all L::Items need to be dropped before we allow mutable iteration via fetch_next.

ManuallyDrop does not make sense here, since you either want to drop nothing in advance (lazy map) or everything (eager map).
The problem here is that iterators are lazy, and especially when sorting, you may just want the first or last item.
Paying for an eager map of the entire query seemed wasteful, so I implemented collect_inner.
This problem stems from mutable and immutable iteration returning the same struct.

Maybe we could actually have two impl blocks for QueryManyIter, one for mutable QueryData, and one for immutable. It would be nice if you could have sort methods with the same name on both, returning an eagerly mapped or lazily mapped iterator respectively.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it sounds like I understand the aliasing issue!

Regarding eager vs. lazy iteration: Note that you still have to drop all of the L::Items even if you just take the first or last item. It just happens implicitly when you drop the Map<IntoIter> without fully iterating it.

(And, of course, most actual L::Item types like &T and Mut<T> have no drop glue. So we're doing nothing, and the question is when we're doing nothing :).)

The expensive thing we want to avoid in the immutable case is having to allocate a fresh Vec<Entity> and copy everything over. And it would be even better if we could avoid it in the mutable case, too!

I think something like this works:

let mut keyed_query: Vec<_> = query_lens
    .map(|(key, entity)| (ManuallyDrop::new(key), NeutralOrd(entity)))
    .collect();
keyed_query.sort();
for (item, _) in &mut keyed_query {
    // SAFETY: `keyed_query` is immediately consumed by the `map`, which ignores the item,
    // so the value is never exposed to safe code or dropped again.
    unsafe { ManuallyDrop::drop(item) };
}
let entity_iter = keyed_query.into_iter().map(|(.., entity)| entity.0);

We're eagerly dropping the lens items, but we have to drop those eventually, so that's moving work around rather than adding anything. And for the common case where there is no drop glue, the compiler can eliminate the entire loop.

There is no extra collect(), so there is no extra memory allocation or copying.

And the lens items have all been dropped, so there are no longer aliasing concerns and we can offer fetch_next to mutable queries!

Copy link
Contributor Author

@Victoronz Victoronz Nov 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will compile/is safe.
See this example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=df1f5c8daf4f12206f4e954701efc3b7

Calling the destructor of a reference will not end its borrow, since Drop::drop/drop glue doesn't deallocate the value itself!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does compile! But further testing with miri shows that it isn't sound :(. I assumed that values wrapped in ManuallyDrop no longer need to be valid, but it seems that they do. (But maybe that will change?)

It does work if we use MaybeUninit, though! We can drop its contents early, and rust doesn't require that they remain valid since they may be uninitialized. It's more awkward to use, and doesn't impl Ord, but I got something that works and seems to make miri happy!

I don't have permission to make a PR to this PR, so I'll push up the code as a PR in my own repo: chescock#6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I am still skeptical? The difference between mem::drop and drop_in_place/Drop::drop is that only mem::drop discards/deallocates the value. For the other two, a reference will still be valid after having been dropped! (It is Copy, so has no drop glue).
As such, the for loop with MaybeUninit::assume_init_drop does not actually do anything.
Whether this is sound or not I honestly don't know, and even if miri doesn't catch something doesn't mean it is sound.
If it were to be, how does the generated code change? By using a wrapper that opts out of validity like this, we might lose out on optimizations just as much vs if we had just mem::dropped them all with a collect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like we'll get it figured out in a comment thread here, and it's tricky enough that it definitely needs more eyes on it, so I'll plan to push it up as a separate PR once this one is merged. I'm hoping we agree that if we can find a way to do this soundly, that it would be a nice improvement to the API! :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to open that PR after! If it can be done soundly and is better than just a plain collect, it'd be nice!

where
L::Item<'w>: Ord,
{
let world = self.world;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of duplication between these methods, just as there is with the original sort() methods. The only difference is in the particular sort method called, and whether you wrap the entity in NeutralOrd. Would it make sense to try to share the implementation by making a common method that takes a FnOnce(&mut Vec<...>) (and always wraps in NeutralOrd)?

Maybe it's easier to explain if I create a PR that does that for the existing sort methods ... okay, #16203.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tried, but ran into strange lifetime issues, so I punted on it. I also wanted to preserve when Entity was wrapped in NeutralOrd.

@@ -1841,7 +2650,7 @@ impl<'w, 's, D: QueryData, F: QueryFilter> QueryIterationCursor<'w, 's, D, F> {
}

// NOTE: If you are changing query iteration code, remember to update the following places, where relevant:
// QueryIter, QueryIterationCursor, QuerySortedIter, QueryManyIter, QueryCombinationIter, QueryState::par_fold_init_unchecked_manual
// QueryIter, QueryIterationCursor, QuerySortedIter, QueryManyIter, QuerySortedManyIter, QueryCombinationIter, QueryState::par_fold_init_unchecked_manual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of very similar structs here: QuerySortedIter, QueryManyIter, and QuerySortedManyIter, and #13477 would add another one, QueryManyUniqueIter. Let me check that I understand the reason for each one.

The "sorted" ones can assume that the entities exist, because we looked them up in the world when querying the lens (QSI, QSMI), and the others can't because they were passed in (QMI, QMUI). If they are guaranteed to exist, then we can impl ExactSizeIterator and we can do entities.get(entity).unwrap() in fetch_next and avoid the need for a loop. We can also forget the QueryFilter, because it was already evaluated.

Some of them can assume the entities are unique because we got them from a query (QSI) or we checked for duplicates (QMUI) and the others can't (QMI, QSMI). If there may be duplicate entities, then it's not sound to perform mutable iteration. QueryManyIter offers mutable streaming iteration through a fn fetch_next(&mut self) method.

Finally, the sorted ones allow the entity_iter to hold live copies of L::Item, which is why QuerySortedManyIter can't even do mutable streaming iteration without collect_inner.

Is that right?

It might be good to have some comments highlighting the differences in these otherwise-similar structs!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any ways we could remove some of the duplication? Both to reduce the risk that the implementations get out of sync, and to reduce the size of the public API.

Maybe the sorted ones could be combined with the others by adding a type parameter that indicates whether the entities were already queried. Something like QueryIndirectIter<'w, 's, D, F, I, const FILTERED: bool>, where you only impl ExactSizeIterator for QII<D, F, I, true> and where you have an if FILTERED { unreachable_unchecked() } in the fetch_next method.

Maybe QuerySortedIter could be changed to be a wrapper over QuerySortedManyIter, similar to how #13477 makes QueryManyUniqueIter a wrapper over QueryManyIter. (I don't think you can combine those with a type parameter because you want to impl Iterator if D: ReadOnlyQueryData OR they are unique, and those would be overlapping impls.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You got it! More documentation should definitely be made. Currently, the docs for Query do not mention sorting, which they really should as well.

Yeah, there are a lot of iterator types, however it is normal in the rust ecosystem to have different structs for different forms of iteration, just like every adapter on iterator returns its own struct. (That bevy has combined mutable and immutable into a single type is already unusual)

However I do not think combining them is a good idea. That would not reduce code duplication by much, since different forms of iteration will still need their own impl blocks regardless.
Quite the opposite, I would find it more confusing both from a dev, and the user standpoint.
Since picking the "wrong kind" of iteration is quite common, and iterators like to nest, it is very important to be as clear as we can inside errors. (Errors unpack type aliases)
Besides, future iterator types that actually do just do the same kind of iteration can be type aliases in clear manner:
pub type QuerySortedManyUniqueIter<...> = QuerySortedIter<...>;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with duplication is mainly the various fetch_next methods, which duplicate the logic of Query::get. QueryManyIter::fetch_next_aliased_unchecked, QuerySortedIter::fetch_next, and QuerySortedManyIter::fetch_next_aliased_unchecked are almost identical, and it's hard to see what the differences are if you want to make changes to the logic.

If we want to have multiple public types, then maybe we can have a single inner type with the complicated fetch_next implementation, and have the other types wrap that with appropriate safety assertions.

But we can always refactor once this and #13477 are both merged, so I don't think we need to worry about it yet!

Copy link
Contributor Author

@Victoronz Victoronz Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several properties about their entities query iterators can rely on/guarantee:

Iterator Uniqueness Existence QD/QF eval* dense EL/A order**
QueryManyIter Unknown Unknown Unknown Unknown
QueryManyUniqueIter Known Unknown Unknown Unknown
QuerySortedManyIter Unknown Known Known Depends
QuerySortedManyUniqueIter Known Known Known Depends
QuerySortedIter Known Known Known Depends
QueryIter Known Known Depends Known
QueryParIter Known Known Depends n/a?
QueryParManyIter Unknown Unknown Unknown n/a?
QueryParManyUniqueIter Known Unknown Unknown n/a?
QueryCombinationsIter n/a Known Depends n/a
QuerySortedCombinationsIter n/a Known Known n/a
QueryCombinationsManyIter n/a Unknown Unknown n/a
QuerySortedCombinationsManyIter n/a Known Known n/a

*whether QueryData/QueryFilter matches (QueryFilter::IS_ARCHETYPAL is relevant here)
**dense EntityLocation/Archetype order

Take this as a rough sketch, it is my current understanding of sensible iterator types, given current/potential API.
It is not an exhaustive or definite list, I am also least familiar with combination/parallel iterators)
Each of these properties can change what fetch_next/its equivalent can do/skip, which makes it really difficult to try and deduplicate without introducing unnecessary complexity or perf hits. I'm not gonna say it is impossible, maybe there is even a good way to intertwine these, but in the meantime, documenting these properties would likely make more sense imo.

@iiYese iiYese removed their request for review November 2, 2024 22:14
@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Nov 4, 2024
@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Nov 4, 2024
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 3, 2024
Merged via the queue into bevyengine:main with commit aa600ae Dec 3, 2024
29 checks passed
ecoskey pushed a commit to ecoskey/bevy that referenced this pull request Jan 6, 2025
# Objective

~Blocked on bevyengine#13417~

Motivation is the same as in bevyengine#13417. If users can sort `QueryIter`, to
only makes sense to also allow them to use this functionality on
`QueryManyIter`.

## Solution

Also implement the sorts on `QueryManyIter`. 

The implementation of the sorts themselves are mostly the same as with
`QueryIter` in bevyengine#13417.
They differ in that they re-use the `entity_iter` passed to the
`iter_many`, and internally call `iter_many_unchecked_manual` on the
lens `QueryState` with it.
These methods also return a different struct, `QuerySortedManyIter`,
because there is no longer a guarantee of unique entities.
`QuerySortedManyIter` implements the various `Iterator` traits for
read-only iteration, as `QueryManyIter` does + `DoubleEndedIterator`.
For mutable iteration, there is both a `fetch_next` and a
`fetch_next_back` method. However, they only become available after the
user calls `collect_inner` on `QuerySortedManyIter` first. This collects
the inner `entity_iter` (this is the sorted one, **not** the original
the user passed) to drop all query lens items to avoid aliasing.
When TAITs are available this `collect_inner` could be hidden away,
until then it is unfortunately not possible to elide this without either
regressing read-only iteration, or introducing a whole new type, mostly
being a copy of `QuerySortedIter`.

As a follow-up we could add a `entities_all_unique` method to check
whether the entity list consists of only unique entities, and then
return a `QuerySortedIter` from it (under opaque impl Trait if need be),
*allowing mutable `Iterator` trait iteration* over what was originally
an `iter_many` call.
Such a method can also be added to `QueryManyIter`, albeit needing a
separate, new return type.

## Testing

I've switched the third example/doc test under `sort` out for one that
shows the collect_inner/fetch_next_back functionality, otherwise the
examples are the same as in bevyengine#13417, adjusted to use `iter_many` instead
of `iter`.

The `query-iter-many-sorts` test checks for equivalence to the
underlying sorts.
The test after shows that these sorts *do not* panic after
`fetch`/`fetch_next` calls.

## Changelog

Added `sort`, `sort_unstable`, `sort_by`, `sort_unstable_by`,
`sort_by_key`, `sort_by_cached_key` to `QueryManyIter`.
Added `QuerySortedManyIter`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes D-Unsafe Touches with unsafe code in some way S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it X-Uncontroversial This work is generally agreed upon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants