-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37543][table-planner] Support sink reuse in batch mode #26379
base: master
Are you sure you want to change the base?
Conversation
.booleanType() | ||
.defaultValue(false) | ||
.withDescription( | ||
"When it is true, the optimizer will try to find out duplicated table sinks and " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: find out duplicated -> find duplicate
.withDescription( | ||
"When it is true, the optimizer will try to find out duplicated table sinks and " | ||
+ "reuse them. This works only when " | ||
+ TABLE_OPTIMIZER_REUSE_SUB_PLAN_ENABLED.key() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious when would you have TABLE_OPTIMIZER_REUSE_SUB_PLAN_ENABLED set and not want to reuse duplicates? i.e. is it feasible for us to extend what TABLE_OPTIMIZER_REUSE_SUB_PLAN_ENABLED means rather than introduce a new flag?
import org.apache.flink.table.connector.sink.DynamicTableSink; | ||
|
||
/** | ||
* Interface for {@link DynamicTableSink}s that support target column writing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: support -> supports
* Interface for {@link DynamicTableSink}s that support target column writing. | ||
* | ||
* <p>The planner will parse target columns from the DML clause and call {@link | ||
* #applyTargetColumns(int[][])} to pass an array of column index paths to sink. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: to sink -> to the sink
|
||
/** | ||
* This checks each sink node to see if it can be reused with another sink node. If so, we will | ||
* reuse all reusable sink to one instance. This only used in the STATEMENT SET clause with multiple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: sink -> sinks
I do not know what you mean by to one instance
in this sentence. Maybe this sentence could be
If we find we have duplicate sinks that can be reused, then only one of these sink will be used. This is an optimization, so that we do not need to process multiple sinks that are actually representing the same destination table.
RelTraitSet currentInputTraitSet = currentSinkNode.getInput().getTraitSet(); | ||
for (ReusableSinkGroup group : reusableSinkGroups) { | ||
// Only table sink with the same digest, specs and input trait set can be reused | ||
if (!(group.digest.equals(currentSinkDigest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nicer code to have the if as a positive if and then have an else to avoid the continue.
Also I wonder if this change could be moved into a method in ReusableSinkGroup
class
break; | ||
} | ||
|
||
if (!canBeReused) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this if mean? It is saying that if we cannot Reuse then add to a resuableSinkGroup. This does not look right. I was expecting if (canBeReused) {
return digest.toString(); | ||
} | ||
|
||
private static class ReusableSinkGroup { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this static?
What is the purpose of the change
Support sink reuse in batch mode
Brief change log
Verifying this change
This change is covered by tests, such as (please describe tests).
BatchSinkReuseTest
SinkReuseITCase
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation