Correlated subqueries have incredibly poor performance #244
Replies: 5 comments 4 replies
-
I… don't know. Can look into it if you're willing to offer a reproducer. Otherwise, I'd at least look at SQLite versions, and It would be surprising if (just) Wasm was the cause for 2 orders of magnitude slowness. |
Beta Was this translation helpful? Give feedback.
-
I realized there wasn't an index on _, err = db.Exec(`
create index comment_post_id_idx on comments (post_id);
`)
if err != nil {
return err
} and changed the post count from 100 to 1,000 (and commented out three of the four test cases to focus on just the worst case): // {100, 1, 25, 1},
// {100, 1, 25, 100},
// {100, 100, 25, 1},
{1000, 100, 25, 100}, With those two changes memory allocations decreased dramatically, but relative performance became even worse:
ncruces is ~130x slower than modernc and ~230x slower than mattn and tailscale. My previous theory was a plausible explanation for memory usage — the lack of an index was causing table scans, etc. — but it doesn't explain this. |
Beta Was this translation helpful? Give feedback.
-
I'll look into it when I can.
Also, as I said, |
Beta Was this translation helpful? Give feedback.
-
I'm not sure if correlated subqueries are fast or slow, but that's not what's being measured here. Take this modified benchmark: func BenchmarkDB(b *testing.B) {
b.Run("", func(b *testing.B) {
db, err := OpenDB(":memory:")
if err != nil {
b.Fatal(err)
}
defer db.Close()
var version string
err = db.QueryRow(`SELECT sqlite_version();`).Scan(&version)
if err != nil {
b.Fatal(err)
}
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
for pb.Next() {
_, err := db.Exec(`WITH RECURSIVE c(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM c WHERE x < 1000000) SELECT x FROM c;`)
if err != nil {
b.Fatal(err)
}
}
})
})
} That query just iterates from 1 to 1 million. Results:
Looks terrible. The reason for this, is that my driver's All the other drivers step through the query once. They don't even throw an error if your query returns (multiple) rows. I find this absurd, and stand by my implementation. While my driver was stuck reading your 2GB database in a sec to produce 1000 results, other drivers were returning one result from page cache in 3ms. Yeah, that happens. |
Beta Was this translation helpful? Give feedback.
-
OK, I have an answer. ncruces implements ncruces calls SQLite's exec func. SQLite's exec func steps through all of the results. modernc, mattn and tailscale all role their own exec funcs. Their exec funcs step the results at most once. (I think ncruces' Here are the times for:
In the case of ncruces, But that's with 1 cpu. All of the drivers improve performance with additional CPUs — ncruces more than the rest:
With 2 CPUs the times are faster but the previous ratios still hold. With 12 CPUs, though... ncruces matches modernc, is 2.5x faster than mattn, and is only 25% slower than tailscale. Another thing to notice is that ncruces' memory usage also scales with CPUs, going from 690 allocs/op with 2 CPUs to 1,129 allocs/op with 12 CPUs. The other drivers also increase allocs/op, but not so drastically. Finally, ncruces uses substantially more memory than modernc, the other Go-based driver: ~325KB vs ~15KB. === I think we can close this, but I'll wait to see if there's anything you want to add first. |
Beta Was this translation helpful? Give feedback.
-
I've been comparing ncruces against other Go-based SQLite packages, mostly to make sure I understand the tradeoffs.
Most of the results have been along the lines of what I expected, but today I came across something rather unexpected: correlated subqueries have incredibly poor performance.
This query:
produces these results in ncruces, modernc, and tailscale:
With one thread, ncruces is ~120x slower than modernc, and ~235x slower than tailscale (a CGO implementation).
I've looked at compile options, and I've looked at pragmas. I haven't come across anything that would explain this result.
I'm beginning to wonder if this is a WASM issue?
Here's something from the SQLite docs:
I don't know how SQLite implements co-routines, but maybe, maybe, the implementation is not WASM-friendly?
Just a guess.
Thoughts?
(Btw, the
posts
table has 100 posts, and 20 comments per post. It's not big!)Beta Was this translation helpful? Give feedback.
All reactions