docs/source/contributor-guide/roadmap.md: 81 additions, 0 deletions
@@ -43,3 +43,84 @@ start a conversation using a github issue or the
 make review efficient and avoid surprises.
 
 [The current list of `EPIC`s can be found here](https://github.com/apache/datafusion/issues?q=is%3Aissue+is%3Aopen+epic).
+
+# Quarterly Roadmap
+
+A quarterly roadmap will be published to give the DataFusion community
+visibility into the priorities of the project's contributors. This roadmap is not
+binding, and we welcome any and all contributions to help keep this list up to
+date.
+
+## 2023 Q4
+
+- Improve data output capability (`COPY`, `INSERT`, and DataFrame) [#6569](https://github.com/apache/datafusion/issues/6569)
+- Implementation of `ARRAY` types and related functions [#6980](https://github.com/apache/datafusion/issues/6980)
+- Write an industrial paper about DataFusion for SIGMOD [#6782](https://github.com/apache/datafusion/issues/6782)
+
+## 2022 Q2
+
+### DataFusion Core
+
+- IO Improvements
+  - Reading, registering, and writing more file formats from both DataFrame API and SQL
+  - Additional options for IO including partitioning and metadata support
+- Work Scheduling
+  - Improve predictability, observability and performance of IO and CPU-bound work
+  - Develop a more explicit story for managing parallelism during plan execution
+- Memory Management
+  - Add more operators for memory limited execution
+- Performance
+  - Incorporate row-format into operators such as aggregate
+  - Add row-format benchmarks
+  - Explore JIT-compiling complex expressions
+  - Explore LLVM for JIT, with inline Rust functions as the primary goal
+  - Improve performance of Sort and Merge using Row Format / JIT expressions
+- Documentation
+  - General improvements to DataFusion website
+  - Publish design documents
+- Streaming
+  - Create `StreamProvider` trait
+
+### Ballista
+
+- Make production ready
+  - Shuffle file cleanup
+  - Fill functional gaps between DataFusion and Ballista
+  - Improve task scheduling and data exchange efficiency
+  - Better error handling
+    - Task failure
+    - Executor lost
+    - Schedule restart
+  - Improve monitoring and logging
+  - Auto scaling support
+- Support for multi-scheduler deployments. Initially for resiliency and fault tolerance but ultimately to support sharding for scalability and more efficient caching.
+- Executor deployment grouping based on resource allocation