-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Minor: add a sql_planner benchmarks to reflecte select many field on a huge table #9536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -194,6 +196,16 @@ fn criterion_benchmark(c: &mut Criterion) { | |||
b.iter(|| physical_plan(&ctx, "SELECT c1 FROM t700")) | |||
}); | |||
|
|||
// Test simplest | |||
c.bench_function("logical_select_all_from_1000", |b| { | |||
b.iter(|| logical_plan(&ctx, "SELECT * FROM t1000")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously 1000 > 700, but I'm wondering if we should just reuse the t700 table which already intents to be a "huge" table?
So the query would be "SELECT * FROM t700"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me @haohuaijin -- thank you. I think @simonvandel 's suggestion to use a 700 col table is worth considering too.
@simonvandel and @alamb, I chose 1000 instead of 700 because I noticed that the plan time does not increase linearly with the number of columns. This is evident from the benchmarks shown below. I separately ran benchmarks for select * from t700, t1000, and t1500. The plan time increased from 289ms for t700 to 1.3s for t1500. This is a 5x slowdown, even though the number of columns only doubled. But I don't have a strong opinion on wanting to keep 1000, 700 also mean selecting many columns in a huge table. Do I need to change it?
|
Your rationale makes a lot of sense to me. I think 1000 is totally fine -- thanks @haohuaijin and thank you @simonvandel |
Which issue does this PR close?
Closes #.
Rationale for this change
I find current sql_planner doesn't have benchmarks that reflect the query type that selects many fields on a huge table(more than 1000). maybe this also can reflect the query type describe in #7698.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?