Commit e1204a5

Minor: Add docs to EliminateOuterJoins (#4343)
1 parent 561be4f commit e1204a5

1 file changed: +22 -0 lines changed

datafusion/optimizer/src/eliminate_outer_join.rs

Lines changed: 22 additions & 0 deletions
@@ -29,6 +29,28 @@ use datafusion_expr::expr::Cast;
 use std::sync::Arc;

 #[derive(Default)]
+///
+/// Attempt to replace outer joins with inner joins.
+///
+/// Outer joins are typically more expensive to compute at runtime
+/// than inner joins and prevent various forms of predicate pushdown
+/// and other optimizations, so removing them if possible is beneficial.
+///
+/// Inner joins filter out rows that do not match. Outer joins pass rows
+/// that do not match padded with nulls. If there is a filter in the
+/// query that would filter any such null rows after the join, the rows
+/// introduced by the outer join are filtered.
+///
+/// For example, consider `select ... from a left join b on ... where b.xx = 100;`
+///
+/// For rows where `b.xx` is null (as it would be after an outer join),
+/// the `b.xx = 100` predicate filters them out and there is no
+/// need to produce null rows for output.
+///
+/// Generally, an outer join can be rewritten to an inner join if the
+/// filters from the WHERE clause return false whenever any of their
+/// inputs are null and the columns of those predicates come from the
+/// nullable side of the outer join.
 pub struct EliminateOuterJoin;

 impl EliminateOuterJoin {
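
To make the doc comment's claim concrete, here is a minimal standalone sketch (not DataFusion code) that models `select ... from a left join b on a.id = b.a_id where b.xx = 100` on in-memory rows. The row types `A` and `B`, the join column `a_id`, and all function names are hypothetical and exist only for illustration. It assumes SQL-style semantics in which comparing a null with `100` does not evaluate to true, so the null-padded rows produced by the left join never survive the filter.

```rust
// Hypothetical row types for illustration only; these are not DataFusion APIs.
#[derive(Debug, Clone, PartialEq)]
struct A {
    id: i32,
}

#[derive(Debug, Clone, PartialEq)]
struct B {
    a_id: i32,
    xx: Option<i32>, // None models a SQL NULL stored in the column
}

/// `a LEFT JOIN b ON a.id = b.a_id`: unmatched rows of `a` are kept,
/// with `None` standing in for the null-padded `b` side.
fn left_join(a: &[A], b: &[B]) -> Vec<(A, Option<B>)> {
    let mut out = Vec::new();
    for ra in a {
        let mut matched = false;
        for rb in b {
            if ra.id == rb.a_id {
                out.push((ra.clone(), Some(rb.clone())));
                matched = true;
            }
        }
        if !matched {
            out.push((ra.clone(), None));
        }
    }
    out
}

/// `a INNER JOIN b ON a.id = b.a_id`: only matching pairs are produced.
fn inner_join(a: &[A], b: &[B]) -> Vec<(A, B)> {
    let mut out = Vec::new();
    for ra in a {
        for rb in b {
            if ra.id == rb.a_id {
                out.push((ra.clone(), rb.clone()));
            }
        }
    }
    out
}

/// `WHERE b.xx = 100`: a missing row or a NULL `xx` never passes the filter.
fn xx_is_100(b: Option<&B>) -> bool {
    b.and_then(|r| r.xx) == Some(100)
}

/// Left join followed by the null-rejecting filter.
fn left_join_then_filter(a: &[A], b: &[B]) -> Vec<(A, B)> {
    left_join(a, b)
        .into_iter()
        .filter_map(|(ra, rb)| {
            if xx_is_100(rb.as_ref()) {
                rb.map(|rb| (ra, rb))
            } else {
                None
            }
        })
        .collect()
}

/// Inner join followed by the same filter.
fn inner_join_then_filter(a: &[A], b: &[B]) -> Vec<(A, B)> {
    inner_join(a, b)
        .into_iter()
        .filter(|(_, rb)| xx_is_100(Some(rb)))
        .collect()
}

fn main() {
    let a = vec![A { id: 1 }, A { id: 2 }, A { id: 3 }];
    let b = vec![
        B { a_id: 1, xx: Some(100) },
        B { a_id: 2, xx: None },
    ];
    // Both plans yield the same rows, so the outer join can be replaced.
    assert_eq!(left_join_then_filter(&a, &b), inner_join_then_filter(&a, &b));
    println!("{:?}", inner_join_then_filter(&a, &b));
}
```

The equivalence relies on the filter referencing a column from the nullable side (`b`). A filter that only touched columns of `a` would keep the null-padded rows, and the rewrite would not be valid.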
