Skip to content

.*: Box all Futures that are larger than 8KiB #23283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ParkMyCar
Copy link
Member

This PR boxes (aka moves to the heap) all Rust Futures that are larger than 8KiB, and enables the large_futures clippy lint.

Motivation

Calling Rust Futures currently can result in inefficient code with excessive memcpy's, as documented by this issue: rust-lang/rust#99504. There are a number of Futures in our codebase which are quite large >8KiB, and thus could see poor performance. By Box-ing the Futures we'd end up memcpy-ing only a fat pointer, which is 16 bytes, which should be a good performance improvement.

Discussed this in Slack

Tips for reviewer

This PR is split into two commits that can be reviewed separately:

  1. Enabling the new large_futures lint.
  2. Boxing all of the Futures.

When boxing a Future there are two possible approaches, either changing the async fn to fn -> BoxFuture or using .boxed() at the callsite. I generally used .boxed() and only when a function was called multiple times, didn't use generics, and it didn't complicate the return type too much, would I update from async fn to fn -> BoxFuture.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

@ParkMyCar ParkMyCar requested a review from benesch as a code owner November 17, 2023 18:24
@ParkMyCar ParkMyCar requested a review from a team November 17, 2023 18:24
@ParkMyCar ParkMyCar requested a review from a team as a code owner November 17, 2023 18:24
@ParkMyCar ParkMyCar requested a review from a team November 17, 2023 18:24
@ParkMyCar ParkMyCar requested a review from a team as a code owner November 17, 2023 18:24
@ParkMyCar ParkMyCar requested a review from a team November 17, 2023 18:24
@ParkMyCar ParkMyCar requested review from a team, aalexandrov and umanwizard as code owners November 17, 2023 18:24
@def- def- requested review from def- and removed request for a team November 17, 2023 18:29
Copy link
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I triggered nightly and coverage would appreciate waiting for the results.

(I didn't expect the Rust issue to have been opened by a former SAP colleague of mine, hi Ivan!)

Copy link
Contributor

@jkosh44 jkosh44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surfaces changes LGTM. I personally don't understand the difference between boxed() and boxed_local(), it might be useful to add that to the PR description.

I kicked off a scalability benchmark: https://buildkite.com/materialize/nightlies/builds/5283, I thought the results might be interesting.

Comment on lines +179 to +180
# Futures get compiled into state machines, which can end up being very large (10's of KB).
# It's inefficient to moves these around on the stack, so require that they get boxed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to include a link to rust-lang/rust#99504 here.

@@ -529,36 +529,40 @@ pub struct AuthedClient {
}

impl AuthedClient {
async fn new(
adapter_client: &Client,
/// Note: the returned Future is intentionally boxed because it is very large.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is subjective, so it's ultimately up to you, but I think these comments are a little redundant. As long as the lint has a sufficient explanation somewhere around it, I think we can remove these comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1! I also find these comments more noisy than helpful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I assume people that wonder about this would try to remove the boxed call and find out the reason for it through the lint.

@@ -689,17 +690,30 @@ where
/// The `Since` error indicates that the requested `as_of` cannot be served
/// (the caller has out of date information) and includes the smallest
/// `as_of` that would have been accepted.
#[instrument(level = "debug", skip_all, fields(shard = %self.machine.shard_id()))]
pub async fn listen(self, as_of: Antichain<T>) -> Result<Listen<K, V, T, D>, Since<T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit sad that we're changing signatures here... in particular it seems brittle if this method is refactored.

IIUC it would be ~nearly as good to insert a .boxed() before the .await and leave the signature alone... is that right, and if so how would you feel about it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely can revert the change to the signature!

The only reason I originally made the changes instead of using .boxed() was because certain functions were called numerous times, and it seemed easier to change the signature rather than all of the callsites.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah +1 on this too, I dislike baking this into signatures. As long as the lint catches it if the boxed call is missed at a callsite, then I have a decently strong preference to do that everywhere in the persist bits of this change.

@ParkMyCar ParkMyCar marked this pull request as draft November 17, 2023 19:24
@ParkMyCar
Copy link
Member Author

Moving this to the draft stage while I work out if it actually impacts performance or not, given the size of the PR!

@ParkMyCar
Copy link
Member Author

ParkMyCar commented Nov 17, 2023

I kicked off a scalability benchmark: https://buildkite.com/materialize/nightlies/builds/5283, I thought the results might be interesting.

Thanks for kicking off the scalability benchmark! If anything were to show an improvement I think that would be the one

Copy link
Contributor

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat find!

@@ -529,36 +529,40 @@ pub struct AuthedClient {
}

impl AuthedClient {
async fn new(
adapter_client: &Client,
/// Note: the returned Future is intentionally boxed because it is very large.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1! I also find these comments more noisy than helpful

@@ -689,17 +690,30 @@ where
/// The `Since` error indicates that the requested `as_of` cannot be served
/// (the caller has out of date information) and includes the smallest
/// `as_of` that would have been accepted.
#[instrument(level = "debug", skip_all, fields(shard = %self.machine.shard_id()))]
pub async fn listen(self, as_of: Antichain<T>) -> Result<Listen<K, V, T, D>, Since<T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah +1 on this too, I dislike baking this into signatures. As long as the lint catches it if the boxed call is missed at a callsite, then I have a decently strong preference to do that everywhere in the persist bits of this change.

Copy link
Contributor

@teskje teskje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compute changes LGTM.

@@ -73,6 +73,7 @@
#![warn(clippy::disallowed_macros)]
#![warn(clippy::disallowed_types)]
#![warn(clippy::from_over_into)]
#![warn(clippy::large_futures)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very exited about moving these to the workspace Cargo.toml and getting out of the business of having to touch every crate when we adjust a lint :)

@@ -529,36 +529,40 @@ pub struct AuthedClient {
}

impl AuthedClient {
async fn new(
adapter_client: &Client,
/// Note: the returned Future is intentionally boxed because it is very large.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I assume people that wonder about this would try to remove the boxed call and find out the reason for it through the lint.

Copy link
Contributor

@guswynn guswynn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storage stuff looks good!

@jkosh44 boxed_local just doesn't require the boxed future is Send!

Some(RehydrationCommand::Send(command)) => {
self.absorb_command(&command);
/// Note: the returned Future is intentionally boxed because it is very large.
fn step_await_address<'a>(&'a mut self) -> BoxFuture<'a, RehydrationTaskState<T>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im surprised this one is so big!

@petrosagg
Copy link
Contributor

I tried to interpret the scalability benchmark results and this didn't seem to move the needle much. I'm starting to lean against merging this lint as it is applying a premature optimization to many places that don't really benefit by it. An analogy I thought is that we also don't always bother to reuse a vector or use with_capacity in functions that are rarely called because that improves readability and the extra ceremony to reuse the allocation would be a net negative. I believe we would also be against a lint that forced you to be careful with allocations and we would rather only do it when it makes sense.

So all in all I think we should punt on this and box futures if we discover that we spend a lot of time copying memory in some critical part of the code.

@ParkMyCar
Copy link
Member Author

Closing this because we're currently unsure of the actual impact

@ParkMyCar ParkMyCar closed this Jan 8, 2024
@ParkMyCar ParkMyCar deleted the lint/large-futures branch June 17, 2024 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants