@@ -31,44 +31,52 @@ explained in more detail in the [Query Planning and Execution Overview] section
31
31
DataFusion's [LogicalPlan] is an enum containing variants representing all the supported operators, and also
32
32
contains an `Extension` variant that allows projects building on DataFusion to add custom logical operators.
33
33
34
- It is possible to create logical plans by directly creating instances of the [LogicalPlan] enum as follows, but is is
34
+ It is possible to create logical plans by directly creating instances of the [LogicalPlan] enum as shown, but it is
35
35
much easier to use the [LogicalPlanBuilder], which is described in the next section.
36
36
37
37
Here is an example of building a logical plan directly:
38
38
39
- <!-- source for this example is in datafusion_docs::library_logical_plan::plan_1 -->
40
-
41
39
``` rust
42
- // create a logical table source
43
- let schema = Schema::new(vec![
44
-     Field::new("id", DataType::Int32, true),
45
-     Field::new("name", DataType::Utf8, true),
46
- ]);
47
- let table_source = LogicalTableSource::new(SchemaRef::new(schema));
48
-
49
- // create a TableScan plan
50
- let projection = None; // optional projection
51
- let filters = vec![]; // optional filters to push down
52
- let fetch = None; // optional LIMIT
53
- let table_scan = LogicalPlan::TableScan(TableScan::try_new(
54
-     "person",
55
-     Arc::new(table_source),
56
-     projection,
57
-     filters,
58
-     fetch,
59
- )?);
60
-
61
- // create a Filter plan that evaluates `id > 500` that wraps the TableScan
62
- let filter_expr = col("id").gt(lit(500));
63
- let plan = LogicalPlan::Filter(Filter::try_new(filter_expr, Arc::new(table_scan))?);
64
-
65
- // print the plan
66
- println!("{}", plan.display_indent_schema());
40
+ use datafusion::common::DataFusionError;
41
+ use datafusion::arrow::datatypes::{DataType, Field, Schema, SchemaRef};
42
+ use datafusion::logical_expr::{Filter, LogicalPlan, TableScan, LogicalTableSource};
43
+ use datafusion::prelude::*;
44
+ use std::sync::Arc;
45
+
46
+ fn main() -> Result<(), DataFusionError> {
47
+     // create a logical table source
48
+     let schema = Schema::new(vec![
49
+         Field::new("id", DataType::Int32, true),
50
+         Field::new("name", DataType::Utf8, true),
51
+     ]);
52
+     let table_source = LogicalTableSource::new(SchemaRef::new(schema));
53
+
54
+     // create a TableScan plan
55
+     let projection = None; // optional projection
56
+     let filters = vec![]; // optional filters to push down
57
+     let fetch = None; // optional LIMIT
58
+     let table_scan = LogicalPlan::TableScan(TableScan::try_new(
59
+         "person",
60
+         Arc::new(table_source),
61
+         projection,
62
+         filters,
63
+         fetch,
64
+     )?
65
+     );
66
+
67
+     // create a Filter plan that evaluates `id > 500` that wraps the TableScan
68
+     let filter_expr = col("id").gt(lit(500));
69
+     let plan = LogicalPlan::Filter(Filter::try_new(filter_expr, Arc::new(table_scan))?);
70
+
71
+     // print the plan
72
+     println!("{}", plan.display_indent_schema());
73
+     Ok(())
74
+ }
67
75
```
68
76
69
77
This example produces the following plan:
70
78
71
- ```
79
+ ``` text
72
80
Filter: person.id > Int32(500) [id:Int32;N, name:Utf8;N]
73
81
TableScan: person [id:Int32;N, name:Utf8;N]
74
82
```
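
As a rough illustration of the tree shape that the example and its printed output describe, here is a standalone toy model that uses only the Rust standard library. `ToyPlan` and its `display_indent` method are hypothetical names invented for this sketch; they are not DataFusion's `LogicalPlan` or its display helpers, and the predicate is stored as a plain string rather than as an expression.

```rust
use std::fmt::Write;
use std::sync::Arc;

/// Toy stand-in for a logical plan (an assumption for illustration only,
/// NOT DataFusion's `LogicalPlan`).
enum ToyPlan {
    /// Leaf operator: read every row of a named table.
    TableScan { table: String },
    /// Keep only the rows of `input` that satisfy `predicate`.
    Filter { predicate: String, input: Arc<ToyPlan> },
}

impl ToyPlan {
    /// Render the tree one operator per line, indenting each child by two spaces.
    fn display_indent(&self) -> String {
        let mut out = String::new();
        self.write_indented(0, &mut out);
        out
    }

    /// Recursive helper: write this node at `depth`, then its input one level deeper.
    fn write_indented(&self, depth: usize, out: &mut String) {
        let pad = "  ".repeat(depth);
        match self {
            ToyPlan::TableScan { table } => {
                writeln!(out, "{}TableScan: {}", pad, table).unwrap();
            }
            ToyPlan::Filter { predicate, input } => {
                writeln!(out, "{}Filter: {}", pad, predicate).unwrap();
                input.write_indented(depth + 1, out);
            }
        }
    }
}

fn main() {
    // The scan is the leaf; the filter wraps it, just as in the example above.
    let scan = ToyPlan::TableScan { table: "person".to_string() };
    let plan = ToyPlan::Filter {
        predicate: "person.id > 500".to_string(),
        input: Arc::new(scan),
    };
    print!("{}", plan.display_indent());
}
```

Running this prints `Filter: person.id > 500` followed by an indented `TableScan: person` line. The nesting matches the plan shown above, without the schema annotations that the real plan display adds.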
@@ -78,7 +86,7 @@ Filter: person.id > Int32(500) [id:Int32;N, name:Utf8;N]
78
86
DataFusion logical plans can be created using the [LogicalPlanBuilder] struct. There is also a [DataFrame] API which is
79
87
a higher-level API that delegates to [LogicalPlanBuilder].
80
88
81
- The following associated functions can be used to create a new builder:
89
+ There are several functions that can be used to create a new builder, such as:
82
90
83
91
- `empty` - create an empty plan with no fields
84
92
- `values` - create a plan from a set of literal values
@@ -102,41 +110,107 @@ The following example demonstrates building the same simple query plan as the pr
102
110
<!-- source for this example is in datafusion_docs::library_logical_plan::plan_builder_1 -->
103
111
104
112
``` rust
105
- // create a logical table source
106
- let schema = Schema::new(vec![
107
-     Field::new("id", DataType::Int32, true),
108
-     Field::new("name", DataType::Utf8, true),
109
- ]);
110
- let table_source = LogicalTableSource::new(SchemaRef::new(schema));
111
-
112
- // optional projection
113
- let projection = None;
114
-
115
- // create a LogicalPlanBuilder for a table scan
116
- let builder = LogicalPlanBuilder::scan("person", Arc::new(table_source), projection)?;
117
-
118
- // perform a filter operation and build the plan
119
- let plan = builder
120
-     .filter(col("id").gt(lit(500)))? // WHERE id > 500
121
-     .build()?;
122
-
123
- // print the plan
124
- println!("{}", plan.display_indent_schema());
113
+ use datafusion::common::DataFusionError;
114
+ use datafusion::arrow::datatypes::{DataType, Field, Schema, SchemaRef};
115
+ use datafusion::logical_expr::{LogicalPlanBuilder, LogicalTableSource};
116
+ use datafusion::prelude::*;
117
+ use std::sync::Arc;
118
+
119
+ fn main() -> Result<(), DataFusionError> {
120
+     // create a logical table source
121
+     let schema = Schema::new(vec![
122
+         Field::new("id", DataType::Int32, true),
123
+         Field::new("name", DataType::Utf8, true),
124
+     ]);
125
+     let table_source = LogicalTableSource::new(SchemaRef::new(schema));
126
+
127
+     // optional projection
128
+     let projection = None;
129
+
130
+     // create a LogicalPlanBuilder for a table scan
131
+     let builder = LogicalPlanBuilder::scan("person", Arc::new(table_source), projection)?;
132
+
133
+     // perform a filter operation and build the plan
134
+     let plan = builder
135
+         .filter(col("id").gt(lit(500)))? // WHERE id > 500
136
+         .build()?;
137
+
138
+     // print the plan
139
+     println!("{}", plan.display_indent_schema());
140
+     Ok(())
141
+ }
125
142
```
126
143
127
144
This example produces the following plan:
128
145
129
- ```
146
+ ``` text
130
147
Filter: person.id > Int32(500) [id:Int32;N, name:Utf8;N]
131
148
TableScan: person [id:Int32;N, name:Utf8;N]
132
149
```
133
150
151
+ ## Translating Logical Plan to Physical Plan
152
+
153
+ Logical plans cannot be directly executed. They must be "compiled" into an
154
+ [`ExecutionPlan`], which is often referred to as a "physical plan".
155
+
156
+ Compared to `LogicalPlan`s, `ExecutionPlan`s contain many more details, such as
157
+ specific algorithms and detailed optimizations. Given a
158
+ `LogicalPlan`, the easiest way to create an `ExecutionPlan` is to use
159
+ [`SessionState::create_physical_plan`], as shown below:
160
+
161
+ ``` rust
162
+ use datafusion::datasource::{provider_as_source, MemTable};
163
+ use datafusion::common::DataFusionError;
164
+ use datafusion::physical_plan::display::DisplayableExecutionPlan;
165
+ use datafusion::arrow::datatypes::{DataType, Field, Schema, SchemaRef};
166
+ use datafusion::logical_expr::{LogicalPlanBuilder, LogicalTableSource};
167
+ use datafusion::prelude::*;
168
+ use std::sync::Arc;
169
+
170
+ // Creating physical plans may access remote catalogs and data sources
171
+ // thus it must be run with an async runtime.
172
+ #[tokio::main]
173
+ async fn main() -> Result<(), DataFusionError> {
174
+
175
+     // create a default table source
176
+     let schema = Schema::new(vec![
177
+         Field::new("id", DataType::Int32, true),
178
+         Field::new("name", DataType::Utf8, true),
179
+     ]);
180
+     // To create an ExecutionPlan we must provide an actual
181
+     // TableProvider. For this example, we don't provide any data
182
+     // but in production code, this would have `RecordBatch`es with
183
+     // in memory data
184
+     let table_provider = Arc::new(MemTable::try_new(Arc::new(schema), vec![])?);
185
+     // Use the provider_as_source function to convert the TableProvider to a table source
186
+     let table_source = provider_as_source(table_provider);
187
+
188
+     // create a LogicalPlanBuilder for a table scan without projection or filters
189
+     let logical_plan = LogicalPlanBuilder::scan("person", table_source, None)?.build()?;
190
+
191
+     // Now create the physical plan by calling `create_physical_plan`
192
+     let ctx = SessionContext::new();
193
+     let physical_plan = ctx.state().create_physical_plan(&logical_plan).await?;
194
+
195
+     // print the plan
196
+     println!("{}", DisplayableExecutionPlan::new(physical_plan.as_ref()).indent(true));
197
+     Ok(())
198
+ }
199
+ ```
200
+
201
+ This example produces the following physical plan:
202
+
203
+ ``` text
204
+ MemoryExec: partitions=0, partition_sizes=[]
205
+ ```
206
+
134
207
## Table Sources
135
208
136
- The previous example used a [LogicalTableSource], which is used for tests and documentation in DataFusion, and is also
137
- suitable if you are using DataFusion to build logical plans but do not use DataFusion's physical planner. However, if you
138
- want to use a [TableSource] that can be executed in DataFusion then you will need to use [DefaultTableSource], which is a
139
- wrapper for a [TableProvider].
209
+ The previous examples use a [LogicalTableSource], which is used for tests and documentation in DataFusion, and is also
210
+ suitable if you are using DataFusion to build logical plans but do not use DataFusion's physical planner.
211
+
212
+ However, it is more common to use a [TableProvider]. To get a [TableSource] from a
213
+ [TableProvider], use [provider_as_source] or [DefaultTableSource].
140
214
141
215
[query planning and execution overview]: https://docs.rs/datafusion/latest/datafusion/index.html#query-planning-and-execution-overview
142
216
[architecture guide]: https://docs.rs/datafusion/latest/datafusion/index.html#architecture
@@ -145,5 +219,8 @@ wrapper for a [TableProvider].
145
219
[dataframe]: using-the-dataframe-api.md
146
220
[logicaltablesource]: https://docs.rs/datafusion-expr/latest/datafusion_expr/logical_plan/builder/struct.LogicalTableSource.html
147
221
[defaulttablesource]: https://docs.rs/datafusion/latest/datafusion/datasource/default_table_source/struct.DefaultTableSource.html
222
+ [provider_as_source]: https://docs.rs/datafusion/latest/datafusion/datasource/default_table_source/fn.provider_as_source.html
148
223
[tableprovider]: https://docs.rs/datafusion/latest/datafusion/datasource/provider/trait.TableProvider.html
149
224
[tablesource]: https://docs.rs/datafusion-expr/latest/datafusion_expr/trait.TableSource.html
225
+ [`executionplan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
226
+ [`sessionstate::create_physical_plan`]: https://docs.rs/datafusion/latest/datafusion/execution/session_state/struct.SessionState.html#method.create_physical_plan