Blog: Using Ordering for Better Plans in Apache DataFusion #58
I will review this later today
This is great @akurmustafa -- thank you.
I have some high-level feedback below. Let me know if it makes sense or if you would like me to help implement some of it.
Introduction / Setting
I think the blog would benefit from a few more examples and some more background / introductory content that connects the algorithm described in this post to something users experience. As it is currently written, I think it would be hard for people who do not already have a good understanding of the concepts here to understand the rest of the post.
I left some specific comments
Formatting
I rendered this locally using the instructions here: https://github.com/apache/datafusion-site?tab=readme-ov-file#setup-for-docker
Some of the formatting looks like it isn't quite working as intended:
Thanks @alamb for your feedback as always. I will try to address those points. Please feel free to add additional feedback as we revise this work. I am happy to address it to improve the readability and flow of the document.
Hi @alamb, I have addressed the points you mentioned. I also changed the order of some sections to make the post clearer. I am happy to address reviews by the community.
LGTM - good content for the DF blog
Giving it another read now
Thank you so much @akurmustafa -- this is great! I went through the text quite carefully and have several suggestions, though in my opinion none are required.
If you choose to make any I will be happy to go through and do a final proof reading round as well.
FYI @wiedld and @Omega359 as I think both of you have looked at the code that implements the algorithm described in this post
Co-authored-by: Andrew Lamb <[email protected]>
Thank you so much @akurmustafa and @ozankabak -- I pushed two commits: one with minor changes (spelling/code formatting) and one to update the date to today. I think this is ready to publish!
Minor grammatical updates, one typo.
Co-authored-by: Bruce Ritchie <[email protected]>
Thanks again @akurmustafa @Omega359 and @ozankabak -- I think this one looks really nice
It is posted to the site now: https://datafusion.apache.org/blog/2025/03/11/ordering-analysis/
As per the discussion, I am moving some of my previous blog posts to the DataFusion website. Please feel free to suggest improvements or points that need clarification.
This is a blog post explaining the inner workings of the ordering requirement analysis in DataFusion. The actual rule and analysis are more involved than described in the blog post. However, I think the detail in the blog post is good enough for most use cases.