Slow incremental build/test workflow. #4009
Some additional tests, also running on illumos.

System specifications: Ryzen 5950x (16c / 32t, base clock 3.4GHz, boost up to 4.9GHz)

Clean and build roundtrip times:
Observations: Looks like actual test runtime took ~2 min 49 seconds, so the increase in ptime stats seems to also be mostly due to building the additional test binaries. |
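(For context, `ptime` is the illumos counterpart to `time`; a sketch of the kind of invocation behind these numbers, with the -m microstate-accounting flag as an illustrative choice rather than what was actually run:)

```sh
# Time the test build on illumos without running the tests; -m adds
# microstate accounting to the usual real/user/sys figures.
ptime -m cargo nextest run --no-run
```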
Running with
I was watching |
NOTE TO READER: THE FOLLOWING ARE ALL ON A 32 GiB, 16 CORE LINUX SYSTEM

I tried setting up an incremental rebuild of Nexus, using the following command:
I observed build times of roughly 50 seconds, with the following results.

There are a couple of takeaways from this:
This seems really odd to me, in particular for `schema-updater`. Running the following command, I tried to inspect the schema-updater binary size:
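(The command and its output aren't preserved above; as a rough sketch, something along these lines would give a similar breakdown, assuming a standard Linux toolchain and the default target directory:)

```sh
# Hedged sketch, not the original command: check the overall binary size, then
# the ELF section sizes to see how much of the binary is .debug_* data.
ls -lh target/debug/schema-updater
size -A -d target/debug/schema-updater | grep -i debug
```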
(The Nexus binary itself is very similar in size, coming in at 2.2 GiB.)

The default
Additionally, the total build time seems to be reduced a fair bit: |
I'm seeing a pretty good bump from this. Using a fresh checkout of omicron:
Then setting
Edit: Ryzen 3950x (16c / 32t, base: 3.5GHz, boost: 4.7GHz) |
Seeing a big improvement from |
This did not result in a large difference for me.
|
I tried the same workflow on Linux. The following is for a one-line saga change recompile.
I do not have
This machine is significantly less capable than my regular dev machine. It has a 6-core Xeon Mobile E-2176M processor and 64G of RAM. |
Using #

```
# makes changes in file
#
ry@rytk2:~/src/omicron$ time cargo nextest run --no-run
Compiling omicron-nexus v0.1.0 (/home/ry/src/omicron/nexus)
Compiling omicron-dev-tools v0.1.0 (/home/ry/src/omicron/dev-tools)
Finished test [unoptimized + debuginfo] target(s) in 1m 07s
real 1m7.938s
user 3m33.098s
sys 0m38.624s
ry@rytk2:~/src/omicron$ time cargo nextest run --no-run
Compiling omicron-nexus v0.1.0 (/home/ry/src/omicron/nexus)
Compiling omicron-dev-tools v0.1.0 (/home/ry/src/omicron/dev-tools)
Finished test [unoptimized + debuginfo] target(s) in 44.23s
real 0m44.624s
user 1m8.197s
sys 0m18.583s
ry@rytk2:~/src/omicron$ time cargo nextest run --no-run
Finished test [unoptimized + debuginfo] target(s) in 0.85s
real 0m1.223s
user 0m1.470s
sys 0m1.232s
```
|
One more data point:

```toml
[profile.dev]
debug = "line-tables-only" # This required an update to 1.71.0!
```

This still preserves backtraces, but otherwise removes all debug info from the binaries. Considering that the default option for release builds already removes all this debug info, I think this is a worthwhile inclusion. I'll put up a PR soon with this change, which should be one small step towards alleviating this pain during incremental development. |
Depends on #4025.

Improves incremental rebuilds of Nexus by roughly 50% for debug builds on Linux, through a judicious reduction of debug symbols. This also significantly reduces the size of binaries linking against Nexus. The benefits here seem like they'd be useful across the stack, so I've made this a toggle for the top-level workspace. I'm open to discussion about whether this should be applied more specifically to Nexus.

Part of #4009
Hey folks! Sorry I was out Friday; I caught up on the recording. I indeed was putting some time into this directly, back in, like, March. What was frustrating is that everything I tried showed absolutely no improvements, which is why you didn't see any patches from me. I was putting together a slide deck, expecting to show it off at demo day, so I apologize for screenshots rather than saving the text of this stuff. Here's what I tried and what I found at the time:

First of all, it was suggested to me that I mostly try and build nexus, so I used it as a baseline. That is, for "dirty" builds, I would change lines in omicron-nexus and rebuild.

First, I wanted to get some baseline numbers for where compilation time was going. A dirty rebuild was spending the vast, vast majority of its time in codegen, for both debug and release (screenshots not preserved). So I decided to focus my initial work on bringing codegen time down. I spent a lot of time figuring out how to use various exotic compilation options, benchmarking them, and comparing them.

There was a general suspicion at the time that generics and monomorphization were a significant portion of the compile-time budget, due to diesel and other things. I tried out the "share generics" optimization (screenshot not preserved). This ended up with basically the same numbers as before; I should look into this one again, however, because in my notes I just have "build with shared generics" and not whether it was debug or release. These are the same times as the debug build, which would make sense given https://github.com/rust-lang/rust/blob/master/compiler/rustc_session/src/config.rs#L1077-L1080. It is possible that trying this again for release builds would be of interest; however, it does require a nightly compiler to turn on, as do many of the exotic options I was trying.

Next up, I wanted to validate that monomorphization was in fact a problem.
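The tool being used in the next paragraph is evidently cargo-llvm-lines (referred to as llvm-lines below); a minimal sketch of that kind of invocation, assuming the tool is installed:

```sh
# Count lines of generated LLVM IR per function, including every
# monomorphized copy, and report the largest offenders first.
cargo llvm-lines -p omicron-nexus | head -n 25
```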
By default it sorts by size, that is, by which functions have the largest bodies. So let's see (screenshot not preserved). We have:
So that all makes sense, and confirms what we suspected. However... there's a problem with this analysis. Let's look at the bottom of the list (screenshot not preserved).

A lot of very similar functions end up having different names, because closures get distinct names, so their symbols are slightly different. That means this doesn't really give a super accurate picture: some functions are basically duplicated in a way that llvm-lines won't see. We can fix that, though (screenshot not preserved).

To which you may say: "holy crap, why is register_endpoint a big deal here? It is twice as large as the next function." Well, if we look at that function at the time (screenshot not preserved)... we'll come back to this one later.

So what about number of copies? Well (screenshot not preserved), our three largest functions by number of copies are async machinery. Not much we can do there, that I'm aware of. In general, I would characterize this as:
but even the stuff with the most copies is under 2%. Of course, if we grouped them together, like all the 'async machinery' functions, they'd be larger, but... there are no easy wins here.

What about with our special name mangling? (screenshot not preserved) Why is validating an error our top result here? Well, earlier we suspected... (screenshot not preserved)
This helped, but only a small amount, and we've already done it for the largest function by far. This leads me to believe that trying to deal with the largest functions won't be particularly productive in reducing codegen time, as we've basically already taken all the low-hanging fruit with this one thing. At this point I felt like I had exhausted the
Let's see (screenshot not preserved). We have similar results to our other tooling: optimization is taking 20% of the time, LTO is taking 17% + 14%, then various codegen things, and by the time we get to type checking we're down to 0.5% of the total time. This continues to tell me that codegen is simply slow. Which is not helpful, but I guess it's nice that the tools aren't contradicting each other?

Next I tried out (screenshot not preserved). It's kind of interesting that cargo bloat does have one more interesting option: (screenshots not preserved)
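The "one more interesting option" is presumably cargo-bloat's --time mode, which a later comment calls the time-based cargo-bloat result; a rough sketch of that invocation, assuming cargo-bloat is installed:

```sh
# Report which crates took the longest to compile; -j 1 keeps per-crate
# timings from being skewed by parallel rustc invocations.
cargo bloat --time -j 1
```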
At this point I had exhausted all of my research on how to tackle this problem, and was sick of making my machine (and atrium, for that matter) cry by building everything over and over and over again. So I left it there.

I'm still not 100% sure what will bring build times down significantly. It is possible Rust is simply this slow to compile when building large projects that rely on generics as heavily as we do. I hope that isn't totally true, but at least from this line of investigation, I'm not sure what more we can do there, specifically. |
I think it would be really helpful to close the gap between illumos and Linux. Incremental test compilation for saga changes is around a minute on Linux and around ten minutes on illumos. I'm fine with a longer wait upfront if iteration can move quickly after an initial compile. |
I'm curious about your time-based cargo-bloat result, @steveklabnik. In particular, lalrpop and polar-core seem to both be part of the oso authnz stuff:
Is it possible that oso (and, critically, code generated by oso policies) is actually what's vastly expensive to compile? |
That's consistent with my experience with lalrpop. I tried using lalrpop in the front end of the P4 compiler initially, and the compile times were unreasonably slow. |
@jclulow I did find it a bit odd, as these packages never appeared in my other analyses. Then again, I was focused on incremental compile times of omicron-nexus, so I don't think they would have shown up in those graphs. |
My fear with a lot of the stuff that is basically code generation (serde, lalrpop, diesel, etc.) is that it can be extremely difficult to detect and correctly categorise the impact, in at least two dimensions:
I also wonder, with respect to our linker challenges, how many sections and symbols are produced during expensive code generation only to then be discarded during the expensive link step, getting us both coming and going, as they say. |
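(One rough way to probe that question, purely as a sketch: compare symbol counts in an intermediate rlib against the final binary. The paths and rlib name below are illustrative, and symbol counts are only a proxy for section bloat.)

```sh
# Count symbols the compiler emits into one of Nexus's rlibs versus the
# defined symbols that survive into the final linked binary.
nm -C target/debug/deps/libomicron_nexus-*.rlib 2>/dev/null | wc -l
nm -C --defined-only target/debug/nexus | wc -l
```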
I will now note that
So... yeah. As always, these tools are mostly heuristics: useful for tracking down leads, but also sometimes misleading. |
See also: #1122. |
More anecdata (from Oct 11, so this is a bit old):
Here's the actual sequence:
@sunshowers asked for |
tl;dr: A single-line change in a saga has a 10-minute build penalty for running tests. I'm developing on a 64-core machine with 256G of RAM, using NVMe disks with trim enabled.
I run tests with `cargo nextest run`. I do not typically use `-p`, as it's not always clear what packages my changes may intersect with. In these cases, I feel like the build system should be the one figuring out what needs to be rebuilt, not the user. It's also not clear why a change in a saga, which is seemingly near the top of the build pyramid, would cause a rebuild of unrelated testing packages.

Here is what a test run looks like with no compilation, i.e. just running the tests.
Here is a one-line change to saga code.
Within that time, the build accounts for the majority, as reported by cargo.
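(For reference, this is what the `-p` scoping mentioned above looks like when the affected package is known; the package name here is just an example. It limits which test binaries get rebuilt, at the cost of the user doing that scoping by hand.)

```sh
# Build and run only the tests belonging to a single workspace package.
cargo nextest run -p omicron-nexus
```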