Skip to content

feat: add benchmark for effect of CSE on jacobian computation/compilation #1178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 32 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
2ac29c8
feat: add BCR jacobian benchmark
AayushSabharwal Feb 20, 2025
0e74194
fix: fix preallocation of buffers in BCR benchmark
AayushSabharwal Mar 13, 2025
ac0c0a1
feat: add ThermalFluid jacobian scaling benchmark
AayushSabharwal Mar 13, 2025
5181954
feat: add plots for Symbolics/ThermalFluid benchmark
AayushSabharwal Mar 19, 2025
cdd4b28
feat: track allocation statistics, measure with and without hashconsing
AayushSabharwal Mar 19, 2025
a15f265
build: bump Symbolics, SymbolicUtils compats
AayushSabharwal Mar 19, 2025
b2fd9c5
refactor: improve ThermalFluid benchmark
AayushSabharwal Mar 19, 2025
d5cd9d1
fix: fix BCR benchmark
AayushSabharwal Mar 19, 2025
f349593
fix: fix BCR benchmark
AayushSabharwal Mar 20, 2025
274b3fa
build: bump Symbolics, SymbolicUtils compats
AayushSabharwal Mar 23, 2025
ccd7a10
refactor: use `mean` for ThermalFluid timings
AayushSabharwal Mar 23, 2025
d161cab
refactor: avoid registering trivial functions in ThermalFluid
AayushSabharwal Mar 23, 2025
00cb308
build: bump Symbolics compat
AayushSabharwal Mar 23, 2025
39889ca
refactor: run for smaller `N`, add logging
AayushSabharwal Mar 24, 2025
47b41c9
fix: fix first call time calculation
AayushSabharwal Mar 24, 2025
de7ff53
fix: use `eval` instead of RGFs
AayushSabharwal Mar 24, 2025
8220e42
fix: remove debugging
AayushSabharwal Mar 24, 2025
8f9013d
fix: run for more sizes
AayushSabharwal Mar 24, 2025
be70a60
fix: run for more sizes
AayushSabharwal Mar 24, 2025
ca6d674
fix: clear `occursin_info` cache in `@be` macro
AayushSabharwal Mar 24, 2025
dcf628e
refactor: rewrite benchmarks to account for new derivative calculation
AayushSabharwal Mar 26, 2025
ae3f7b6
build: bump Symbolics compat
AayushSabharwal Mar 26, 2025
2b70c8b
TEMP run benchmark with uint-id
AayushSabharwal Apr 2, 2025
a7b8ba9
fixup! TEMP run benchmark with uint-id
AayushSabharwal Apr 3, 2025
fb66afb
refactor: update BCR benchmark visualization with improved axes and l…
bowenszhu Apr 9, 2025
c6e1468
refactor: enhance ThermalFluid benchmark visualization with improved …
bowenszhu Apr 9, 2025
6a63ed8
Merge pull request #1 from bowenszhu/as/cse-bench
AayushSabharwal Apr 10, 2025
8287ac4
build: bump SymbolicUtils compat
AayushSabharwal Apr 14, 2025
793306a
fix: remove usage of non-primary SymbolicUtils branch
AayushSabharwal Apr 14, 2025
e4bd0fa
TEMP: use branch of SymbolicUtils
AayushSabharwal Apr 17, 2025
ca0924a
fixup! TEMP: use branch of SymbolicUtils
AayushSabharwal Apr 17, 2025
2f46446
fixup! TEMP: use branch of SymbolicUtils
AayushSabharwal Apr 18, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
244 changes: 244 additions & 0 deletions benchmarks/Symbolics/BCR.jmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,244 @@
---
title: BCR Symbolic Jacobian
author: Aayush Sabharwal, Bowen Zhu, Chris Rackauckas
---

The following benchmark is of 1122 ODEs with 24388 terms that describe a stiff
chemical reaction network modeling the BCR signaling network from [Barua et
al.](https://doi.org/10.4049/jimmunol.1102003). We use
[`ReactionNetworkImporters`](https://github.com/isaacsas/ReactionNetworkImporters.jl)
to load the BioNetGen model files as a
[Catalyst](https://github.com/SciML/Catalyst.jl) model, and then use
[ModelingToolkit](https://github.com/SciML/ModelingToolkit.jl) to convert the
Catalyst network model to ODEs.

The resultant large model is used to benchmark the time taken to compute a symbolic
jacobian, generate a function to calculate it and call the function.


```julia
using Catalyst, ReactionNetworkImporters,
TimerOutputs, LinearAlgebra, ModelingToolkit, Chairmarks,
LinearSolve, Symbolics, SymbolicUtils, SymbolicUtils.Code, SparseArrays, CairoMakie,
PrettyTables

using Pkg
Pkg.add(Pkg.PackageSpec(; name = "SymbolicUtils", rev="9d52810c547276a60c0aae502ffee7f93ff99143"))

datadir = joinpath(dirname(pathof(ReactionNetworkImporters)),"../data/bcr")
const to = TimerOutput()
tf = 100000.0

# generate ModelingToolkit ODEs
prnbng = loadrxnetwork(BNGNetwork(), joinpath(datadir, "bcr.net"))
show(to)
rn = complete(prnbng.rn; split = false)
obs = [eq.lhs for eq in observed(rn)]
osys = convert(ODESystem, rn)

rhs = [eq.rhs for eq in full_equations(osys)]
vars = unknowns(osys)
pars = parameters(osys)
```

The `sparsejacobian` function in Symbolics.jl is optimized for hashconsing and caching, and as such
performs very poorly without either of those features. We use the old implementation, optimized without
hashconsing, to benchmark performance without hashconsing and without caching to avoid biasing the results.

```julia
include("old_sparse_jacobian.jl")
```

```julia
SymbolicUtils.ENABLE_HASHCONSING[] = false
@timeit to "Calculate jacobian - without hashconsing" jac_nohc = old_sparsejacobian(rhs, vars);
SymbolicUtils.ENABLE_HASHCONSING[] = true
Symbolics.toggle_derivative_caching!(false)
Symbolics.clear_derivative_caches!()
@timeit to "Calculate jacobian - hashconsing, without caching" jac_hc_nocache = old_sparsejacobian(rhs, vars);
Symbolics.toggle_derivative_caching!(true)
for fn in Symbolics.cached_derivative_functions()
stats = SymbolicUtils.get_stats(fn)
@assert stats.hits == stats.misses == 0
end
Symbolics.clear_derivative_caches!()
@timeit to "Calculate jacobian - hashconsing and caching" jac_hc_cache = Symbolics.sparsejacobian(rhs, vars);

@assert isequal(jac_nohc, jac_hc_nocache)
@assert isequal(jac_hc_nocache, jac_hc_cache)

jac = jac_hc_cache
args = (vars, pars, ModelingToolkit.get_iv(osys))
# out of place versions run into an error saying the expression is too large
# due to the `SymbolicUtils.Code.create_array` call. `iip_config` prevents it
# from trying to build the function.
kwargs = (; iip_config = (false, true), expression = Val{true})
@timeit to "Build jacobian - no CSE" _, jac_nocse_iip = build_function(jac, args...; cse = false, kwargs...);
@timeit to "Build jacobian - CSE" _, jac_cse_iip = build_function(jac, args...; cse = true, kwargs...);

jac_nocse_iip = eval(jac_nocse_iip)
jac_cse_iip = eval(jac_cse_iip)

defs = defaults(osys)
u = Float64[Symbolics.fixpoint_sub(var, defs) for var in vars]
buffer_cse = similar(jac, Float64)
buffer_nocse = similar(jac, Float64)
p = Float64[Symbolics.fixpoint_sub(par, defs) for par in pars]
tt = 0.0

@timeit to "Compile jacobian - CSE" jac_cse_iip(buffer_cse, u, p, tt)
@timeit to "Compute jacobian - CSE" jac_cse_iip(buffer_cse, u, p, tt)

@timeit to "Compile jacobian - no CSE" jac_nocse_iip(buffer_nocse, u, p, tt)
@timeit to "Compute jacobian - no CSE" jac_nocse_iip(buffer_nocse, u, p, tt)

@assert isapprox(buffer_cse, buffer_nocse, rtol = 1e-10)

show(to)
```

We'll also measure scaling.


```julia
function run_and_time_construct!(rhs, vars, pars, iv, N, i, jac_times, jac_allocs, build_times, functions)
outputs = rhs[1:N]
SymbolicUtils.ENABLE_HASHCONSING[] = false
jac_result = @be old_sparsejacobian(outputs, vars)
jac_times[1][i] = minimum(x -> x.time, jac_result.samples)
jac_allocs[1][i] = minimum(x -> x.bytes, jac_result.samples)

SymbolicUtils.ENABLE_HASHCONSING[] = true
jac_result = @be (Symbolics.clear_derivative_caches!(); Symbolics.sparsejacobian(outputs, vars))
jac_times[2][i] = minimum(x -> x.time, jac_result.samples)
jac_allocs[2][i] = minimum(x -> x.bytes, jac_result.samples)

jac = Symbolics.sparsejacobian(outputs, vars)
args = (vars, pars, iv)
kwargs = (; iip_config = (false, true), expression = Val{true})

build_result = @be build_function(jac, args...; cse = false, kwargs...);
build_times[1][i] = minimum(x -> x.time, build_result.samples)
jacfn_nocse = eval(build_function(jac, args...; cse = false, kwargs...)[2])

build_result = @be build_function(jac, args...; cse = true, kwargs...);
build_times[2][i] = minimum(x -> x.time, build_result.samples)
jacfn_cse = eval(build_function(jac, args...; cse = true, kwargs...)[2])

functions[1][i] = let buffer = similar(jac, Float64), fn = jacfn_nocse
function nocse(u, p, t)
fn(buffer, u, p, t)
buffer
end
end
functions[2][i] = let buffer = similar(jac, Float64), fn = jacfn_cse
function cse(u, p, t)
fn(buffer, u, p, t)
buffer
end
end

return nothing
end

function run_and_time_call!(i, u, p, tt, functions, first_call_times, second_call_times)
jacfn_nocse = functions[1][i]
jacfn_cse = functions[2][i]

call_result = @timed jacfn_nocse(u, p, tt)
first_call_times[1][i] = call_result.time
call_result = @timed jacfn_cse(u, p, tt)
first_call_times[2][i] = call_result.time

call_result = @be jacfn_nocse(u, p, tt)
second_call_times[1][i] = minimum(x -> x.time, call_result.samples)
call_result = @be jacfn_cse(u, p, tt)
second_call_times[2][i] = minimum(x -> x.time, call_result.samples)
end
```

# Run benchmark

```julia
Chairmarks.DEFAULTS.seconds = 15.0
N = [10, 20, 40, 80, 160, 320]
jacobian_times = [zeros(Float64, length(N)), zeros(Float64, length(N))]
jacobian_allocs = copy.(jacobian_times)
functions = [Vector{Any}(undef, length(N)), Vector{Any}(undef, length(N))]
# [without_cse_times, with_cse_times]
build_times = copy.(jacobian_times)
first_call_times = copy.(jacobian_times)
second_call_times = copy.(jacobian_times)

iv = ModelingToolkit.get_iv(osys)
run_and_time_construct!(rhs, vars, pars, iv, 10, 1, jacobian_times, jacobian_allocs, build_times, functions)
run_and_time_call!(1, u, p, tt, functions, first_call_times, second_call_times)
for (i, n) in enumerate(N)
@info i n
run_and_time_construct!(rhs, vars, pars, iv, n, i, jacobian_times, jacobian_allocs, build_times, functions)
end
for (i, n) in enumerate(N)
@info i n
run_and_time_call!(i, u, p, tt, functions, first_call_times, second_call_times)
end
```

# Plot figures

```julia
tabledata = hcat(N, jacobian_times..., jacobian_allocs..., build_times..., first_call_times..., second_call_times...)
header = ["N", "Jacobian time (no hashconsing)", "Jacobian time (hashconsing)", "Jacobian allocated memory (no hashconsing) (B)", "Jacobian allocated memory (hashconsing) (B)", "`build_function` time (no CSE)", "`build_function` time (CSE)", "First call time (no CSE)", "First call time (CSE)", "Second call time (no CSE)", "Second call time (CSE)"]
pretty_table(tabledata; header, backend = Val(:html))
```

```julia
f = Figure(size = (750, 400))
titles = [
"Jacobian symbolic computation", "Jacobian symbolic computation", "Code generation",
"Numerical function compilation", "Numerical function evaluation"]
labels = ["Time (seconds)", "Allocated memory (bytes)",
"Time (seconds)", "Time (seconds)", "Time (seconds)"]
times = [jacobian_times, jacobian_allocs, build_times, first_call_times, second_call_times]
axes = Axis[]
for i in 1:2
label = labels[i]
data = times[i]
ax = Axis(f[1, i], xscale = log10, yscale = log10, xlabel = "model size",
xlabelsize = 10, ylabel = label, ylabelsize = 10, xticks = N,
title = titles[i], titlesize = 12, xticklabelsize = 10, yticklabelsize = 10)
push!(axes, ax)
l1 = scatterlines!(ax, N, data[1], label = "without hashconsing")
l2 = scatterlines!(ax, N, data[2], label = "with hashconsing")
end
Legend(f[1, 3], axes[1], "Methods", tellwidth = false, labelsize = 12, titlesize = 15)
axes2 = Axis[]
# make equal y-axis unit length
mn3, mx3 = extrema(reduce(vcat, times[3]))
xn3 = log10(mx3 / mn3)
mn4, mx4 = extrema(reduce(vcat, times[4]))
xn4 = log10(mx4 / mn4)
mn5, mx5 = extrema(reduce(vcat, times[5]))
xn5 = log10(mx5 / mn5)
xn = max(xn3, xn4, xn5)
xn += 0.2
hxn = xn / 2
hxn3 = (log10(mx3) + log10(mn3)) / 2
hxn4 = (log10(mx4) + log10(mn4)) / 2
hxn5 = (log10(mx5) + log10(mn5)) / 2
ylims = [(exp10(hxn3 - hxn), exp10(hxn3 + hxn)), (exp10(hxn4 - hxn), exp10(hxn4 + hxn)),
(exp10(hxn5 - hxn), exp10(hxn5 + hxn))]
for i in 1:3
ir = i + 2
label = labels[ir]
data = times[ir]
ax = Axis(f[2, i], xscale = log10, yscale = log10, xlabel = "model size",
xlabelsize = 10, ylabel = label, ylabelsize = 10, xticks = N,
title = titles[ir], titlesize = 12, xticklabelsize = 10, yticklabelsize = 10)
ylims!(ax, ylims[i]...)
push!(axes2, ax)
l1 = scatterlines!(ax, N, data[1], label = "without hashconsing")
l2 = scatterlines!(ax, N, data[2], label = "with hashconsing")
end
save("bcr.pdf", f)
f
```
25 changes: 25 additions & 0 deletions benchmarks/Symbolics/Project.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
Catalyst = "479239e8-5488-4da2-87a7-35f2df7eef83"
Chairmarks = "0ca39b1e-fe0b-4e98-acfc-b1656634c4de"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
OrdinaryDiffEq = "1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Polynomials = "f27b6e38-b328-58d1-80ce-0feddd5e7a45"
PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
ReactionNetworkImporters = "b4db0fb7-de2a-5028-82bf-5021f5cfa881"
SciMLBenchmarks = "31c91b34-3c75-11e9-0341-95557aab0344"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
SymbolicUtils = "d1185830-fcd6-423d-90d6-eec64667417b"
Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
TimerOutputs = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
XSteam = "95ff35a0-be81-11e9-2ca3-5b4e338e8476"

[compat]
ModelingToolkit = "9"
SymbolicUtils = "3.26"
Symbolics = "6.34"
Loading