Blog: Using Ordering for Better Plans in Apache DataFusion #58

Merged
merged 16 commits into from
Mar 12, 2025

akurmustafa
Contributor

As per the discussion, I am moving some of my previous blog posts to the DataFusion website. Please feel free to suggest improvements or points that need clarification.

This blog post explains the inner workings of the ordering requirement analysis in DataFusion. The actual rule and analysis are more involved than described in the post. However, I think the level of detail is good enough for most use cases.

@alamb
Contributor

alamb commented Mar 6, 2025

I will review this later today

Contributor

@alamb alamb left a comment

This is great @akurmustafa -- thank you.

I have some high-level feedback below. Let me know if it makes sense or if you would like me to help implement some of it.

Introduction / Setting

I think the blog would benefit from a few more examples and some more background / introductory content that connects the algorithm described in this post to something users experience. As it is currently written, I think it would be hard for people who do not already have a good understanding of the concepts here to understand the rest of the post.

I left some specific comments

Formatting

I rendered this locally using the instructions here: https://github.com/apache/datafusion-site?tab=readme-ov-file#setup-for-docker

Some of the formatting looks like it isn't quite working as intended:

[Screenshot 2025-03-06 at 12:17:42 PM]

@akurmustafa
Contributor Author

Thanks @alamb for your feedback, as always. I will try to address those points. Please feel free to add further feedback as we revise this work. I am happy to address it to improve the readability and flow of the document.

@akurmustafa
Contributor Author

Hi @alamb, I have addressed the points you mentioned. I also changed the order of some sections to make the post clearer. I am happy to address reviews from the community.

Contributor

@ozankabak ozankabak left a comment

LGTM - good content for the DF blog

@alamb
Contributor

alamb commented Mar 10, 2025

Giving it another read now

Contributor

@alamb alamb left a comment

Thank you so much @akurmustafa -- this is great! I went through the text quite carefully and have several suggestions, though in my opinion none are required.

If you choose to make any I will be happy to go through and do a final proof reading round as well.

FYI @wiedld and @Omega359 as I think both of you have looked at the code that implements the algorithm described in this post

@alamb alamb changed the title from "Order Requirement Analysis" to "Blog: Using Ordering for Better Plans in Apache DataFusion" Mar 10, 2025
@akurmustafa
Contributor Author

Thanks @alamb for the detailed feedback. I have incorporated your suggestions. Thanks also to @Omega359 for catching the typos; I have fixed them in the commit.
I did a proofread to check that the formatting, rendering, and links are OK. However, a second look is always welcome.

@alamb
Contributor

alamb commented Mar 11, 2025

Thank you so much @akurmustafa and @ozankabak -- I pushed two commits: one with minor changes (spelling/code formatting) and one to update the date to today.

I think this is ready to publish!

Contributor

@Omega359 Omega359 left a comment

Minor grammatical updates, one typo.

@akurmustafa
Contributor Author

> Minor grammatical updates, one typo.

Thanks @Omega359 for such a detailed read-through. I have added those changes in the commit.

@alamb
Contributor

alamb commented Mar 12, 2025

Thanks again @akurmustafa @Omega359 and @ozankabak -- I think this one looks really nice

@alamb alamb merged commit 67ab720 into apache:main Mar 12, 2025
1 check passed
@alamb
Contributor

alamb commented Mar 12, 2025

It is posted to the site now: https://datafusion.apache.org/blog/2025/03/11/ordering-analysis/

4 participants