Skip to content

Implement LogicalPlanBuilder::from for Arc<LogicalPlan> #10465

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alamb opened this issue May 12, 2024 · 5 comments · Fixed by #10466 or #10496
Closed

Implement LogicalPlanBuilder::from for Arc<LogicalPlan> #10465

alamb opened this issue May 12, 2024 · 5 comments · Fixed by #10466 or #10496
Labels
good first issue Good for newcomers optimizer Optimizer rules

Comments

@alamb
Copy link
Contributor

alamb commented May 12, 2024

For historical reasons children in LogicalPlan are Arc<LogicalPlan> (so shared, immutable references)

For many optimizer optimizations we need a owned LogicalPlan

This can be achieved via unwrap_arc

However, as pointed out by #10460 (comment) @ClSlaid it is not obvious at first

Today to get a builder from a plan you need to know this pattern:

// the builder accepts LogicalPlan only, so get one via unwrap_arc
let mut builder = LogicalPlanBuilder::from(unwrap_arc(plan));

I think it would be great to support this pattern:

// the builder accepts Arc<LogicalPlan> directly
let mut builder = LogicalPlanBuilder::from(plan);

And the underlying impl From<Arc<LogicalPlan>> for LogicalPlanBuilder would call unwrap_arc

@alamb alamb added optimizer Optimizer rules good first issue Good for newcomers labels May 12, 2024
@alamb alamb changed the title Ah, I think you are asking about getting LogicalPlan from Arc<LogicalPlan> Implement LogicalPlanBuilder::from for Arc<LogicalPlan> May 12, 2024
@ClSlaid
Copy link
Contributor

ClSlaid commented May 12, 2024

I suggest not call unwrap_arc until the built LogicalPlan really requires the ownership of a LogicalPlan.

In most occasions, LogicalPlan contains an input: Arc<LogicalPlan>, and sometimes we will have to construct LogicalPlanBuilder from it. This will lead to a clone, or (99 percent likely) avoiding that with unwrap_arc.

Meanwhile, most of the LogicalPlans constructed by the builder place the plan in Arcs again. This then causes another clone.

There must be a way to seamlessly moving the input: Arc<LogicalPlan> into rewritten plans, and it would be no clone in most occasions, and at most one clone be committed.

@alamb
Copy link
Contributor Author

alamb commented May 12, 2024

That is an interesting point @ClSlaid

So perhaps you are suggesting LogicalPlanBuilder has an Arc<LogicalPlan> as its internal state (rather than a LogicalPlan)

@ClSlaid
Copy link
Contributor

ClSlaid commented May 13, 2024

So perhaps you are suggesting LogicalPlanBuilder has an Arc<LogicalPlan> as its internal state (rather than a LogicalPlan)

This is surely the most straight forward way.

@alamb
Copy link
Contributor Author

alamb commented May 13, 2024

Great idea @ClSlaid

Thanks to @AbrarNitk we have a first version of LogicalPlanBuilder::from(arc_input) in #10466 🙏

I think we should merge that PR and then make a second PR (and maybe a second ticket) about switching the internal representation from

#[derive(Debug, Clone)]
pub struct LogicalPlanBuilder {
plan: LogicalPlan,
}

To

 pub struct LogicalPlanBuilder { 
     plan: Arc<LogicalPlan>, 
 } 

@alamb
Copy link
Contributor Author

alamb commented May 13, 2024

Filed #10485 to track the internal representation change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers optimizer Optimizer rules
Projects
None yet
2 participants