Function overload database is slow #7570

Open
jimblandy opened this issue Apr 18, 2025 · 0 comments
Labels
area: performance How fast things go naga Shader Translator

jimblandy commented Apr 18, 2025

#6833, which improved Naga's WGSL compatibility a lot, costs more in performance than it seems like it should.

Although it's not our priority right now, I was feeling rebellious so I took a few hours to hook up the perf_event crate to Naga and count cycles and instructions spent in the WGSL front end and the validator. (The perf_event crate only works on Linux.) I rebased this on the wgpu trunk commit just before #6833, and then rebased all of trunk on that, so I could select any commit since then and get comparable timings.
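For readers who want to try something similar without hardware counters, here is a minimal sketch of per-phase measurement. It is not the instrumentation in the branch: it swaps in wall-clock time from `std::time::Instant` for the cycle and instruction counts that perf_event provides, and `measure` and `fake_front_end` are hypothetical names used only for illustration.

```rust
use std::time::{Duration, Instant};

/// Runs `phase` once and returns its result together with the elapsed
/// wall-clock time. Hypothetical helper: the instrumented branch reads
/// hardware counters (cycles, instructions) rather than a clock.
fn measure<T>(phase: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = phase();
    (result, start.elapsed())
}

/// Stand-in for a compiler phase; it just sums the bytes of its input.
fn fake_front_end(source: &str) -> u64 {
    source.bytes().map(u64::from).sum()
}

fn main() {
    let (checksum, elapsed) = measure(|| fake_front_end("fn main() {}"));
    println!("front end: checksum {checksum}, took {elapsed:?}");
}
```

Wall-clock time is much noisier than instruction counts, so a sketch like this is only good for coarse comparisons between commits.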

Using a large WGSL shader generated from SPIR-V, the change across landing #6833 was:

  • in the WGSL front end, 15.047M instructions to 16.204M instructions (+7.7%)
  • in the validator, 0.775M instructions to 1.828M instructions (+136%)

Note that before the change, validation cost only about 5% as much as parsing (0.775M vs. 15.047M instructions), so the larger percentage change there is less significant overall. In both cases the absolute change is around 1M instructions, which makes sense: they're both visiting the same set of math operations.
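The arithmetic behind those figures can be checked with a few lines. This is just a sketch that recomputes the deltas from the instruction counts quoted above; the `change` helper is a name made up for this example.

```rust
/// Returns (absolute change, percentage change) going from `before` to `after`.
fn change(before: f64, after: f64) -> (f64, f64) {
    let delta = after - before;
    (delta, 100.0 * delta / before)
}

fn main() {
    // Instruction counts in millions, copied from the measurements above.
    let (front_delta, front_pct) = change(15.047, 16.204);
    let (valid_delta, valid_pct) = change(0.775, 1.828);
    println!("front end: +{front_delta:.3}M instructions ({front_pct:+.1}%)");
    println!("validator: +{valid_delta:.3}M instructions ({valid_pct:+.1}%)");
    // Validator instructions relative to the front end, before the change.
    println!("validator / front end, before: {:.1}%", 100.0 * 0.775 / 15.047);
}
```

Both deltas come out a little over 1M instructions (about 1.157M and 1.053M), even though the relative changes differ by more than an order of magnitude.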

I don't know whether it's reasonable to expect Naga to do more work (computing the right type conversions) without any loss of performance. But it also doesn't seem to me that this should more than double the cost of validation.

I think most users are willing to pay a certain amount for standards conformance. The question is whether this much cost is really necessary, or whether a different design - say, letting OverloadSet take all the argument types simultaneously, and choosing a different way to report errors - would let us have our cake and eat it too.
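To make the trade-off concrete, here is a toy sketch contrasting the two shapes of lookup. Nothing here is Naga's actual API: `Ty`, `Overload`, `MAX`, and both `resolve_*` functions are hypothetical, and the sketch ignores the automatic conversions that real WGSL overload resolution has to handle. It only illustrates why narrowing one argument at a time gives precise errors but does more work per call.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    I32,
    F32,
    Vec3F32,
}

/// One signature in a toy overload table (hypothetical, not Naga's type).
struct Overload {
    params: &'static [Ty],
    ret: Ty,
}

/// Toy table for a two-argument, `max`-like builtin.
const MAX: &[Overload] = &[
    Overload { params: &[Ty::I32, Ty::I32], ret: Ty::I32 },
    Overload { params: &[Ty::F32, Ty::F32], ret: Ty::F32 },
    Overload { params: &[Ty::Vec3F32, Ty::Vec3F32], ret: Ty::Vec3F32 },
];

/// Narrows the candidate set one argument at a time. On failure, returns the
/// index of the first argument that no remaining candidate accepts.
fn resolve_incremental(table: &[Overload], args: &[Ty]) -> Result<Ty, usize> {
    let mut candidates: Vec<&Overload> = table
        .iter()
        .filter(|o| o.params.len() == args.len())
        .collect();
    for (i, arg) in args.iter().enumerate() {
        // Each step builds a fresh, smaller set of candidates.
        candidates = candidates
            .into_iter()
            .filter(|o| o.params[i] == *arg)
            .collect();
        if candidates.is_empty() {
            return Err(i);
        }
    }
    candidates.first().map(|o| o.ret).ok_or(0)
}

/// Checks each overload against the whole argument list in one pass, with no
/// intermediate sets. A mismatch carries no argument index.
fn resolve_all_at_once(table: &[Overload], args: &[Ty]) -> Option<Ty> {
    table.iter().find(|o| o.params == args).map(|o| o.ret)
}

fn main() {
    let good = [Ty::F32, Ty::F32];
    let bad = [Ty::F32, Ty::I32];
    println!("{:?}", resolve_incremental(MAX, &good));
    println!("{:?}", resolve_all_at_once(MAX, &good));
    println!("{:?}", resolve_incremental(MAX, &bad));
    println!("{:?}", resolve_all_at_once(MAX, &bad));
}
```

The incremental version can say which argument went wrong, while the single-pass version is cheaper on the success path and would need a separate, slower path to produce an equally helpful error.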

I've pushed my instrumented Naga trunk here: https://github.com/jimblandy/wgpu/tree/perf-trunk

If you simply run the naga CLI, it will log counts:

naga$ cargo run --release -p naga-cli -- a-large-shader.wgsl 
    Finished `release` profile [optimized + debuginfo] target(s) in 0.14s
     Running `/home/jimb/wgpu/target/release/naga a-large-shader.wgsl`
[2025-04-18T07:42:10Z INFO  naga::benchmark::perf_event] wgsl_in: 5.645 Mcycles / 16.205 Minsns =  0.35 cycles/insn
[2025-04-18T07:42:10Z INFO  naga::benchmark::perf_event] valid::validate: 0.956 Mcycles / 1.828 Minsns =  0.52 cycles/insn
Validation successful
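For reading those log lines, the last column is simply cycles divided by instructions. The sketch below reproduces that arithmetic; `summarize` is a hypothetical helper rather than the function in the branch, the raw counts are back-computed from the rounded figures in the log, and it assumes a nonzero instruction count.

```rust
/// Formats raw counter values in roughly the same shape as the log lines above.
/// Assumes `insns` is nonzero.
fn summarize(label: &str, cycles: u64, insns: u64) -> String {
    let mcycles = cycles as f64 / 1e6;
    let minsns = insns as f64 / 1e6;
    format!(
        "{label}: {mcycles:.3} Mcycles / {minsns:.3} Minsns = {:.2} cycles/insn",
        mcycles / minsns
    )
}

fn main() {
    // Raw counts reconstructed from the rounded values in the log output.
    println!("{}", summarize("wgsl_in", 5_645_000, 16_205_000));
    println!("{}", summarize("valid::validate", 956_000, 1_828_000));
}
```

A lower cycles/insn figure means more instructions retired per cycle, so by this measure the front end runs its instructions more efficiently than the validator does.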
Projects
Status: Todo