When differentiating with respect to an empty array, the results vary by backend:
```julia
using DifferentiationInterface, ForwardDiff, ReverseDiff, Mooncake, Enzyme

ADTYPES = [
    AutoForwardDiff(),
    AutoReverseDiff(),
    AutoMooncake(; config=nothing),
    AutoEnzyme(; mode=Forward),
    AutoEnzyme(; mode=Reverse),
    # and more...
]

for adtype in ADTYPES
    DifferentiationInterface.value_and_gradient(sum, adtype, Float64[])
end
```
ReverseDiff, Mooncake, and reverse-mode Enzyme all happily return `(0.0, [])` 😄
Forward Enzyme tries to use a batch size of 0 and errors:
And ForwardDiff tries to construct a `GradientResult`, which errors:
Funnily enough, `gradient` with ForwardDiff (rather than `value_and_gradient`) is fine because it doesn't try to construct the `GradientResult`. I imagine the other operators would also have varying behaviour.
I suppose it is a bit of a trivial edge case, but would it be possible to unify the behaviour of the AD backends?
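
For illustration, here is a rough sketch of what a unified behaviour could look like from the user side, assuming the `(0.0, [])` result of the reverse-mode backends is the desired one. The helper name `value_and_gradient_or_empty` is hypothetical and not part of DifferentiationInterface; it just short-circuits the empty case before any backend is called.

```julia
using DifferentiationInterface
import ForwardDiff

# Hypothetical helper (not part of DifferentiationInterface):
# skip the backend entirely when there is nothing to differentiate.
function value_and_gradient_or_empty(f, adtype, x::AbstractArray)
    if isempty(x)
        # No inputs, so the gradient is an empty array of the same type as x.
        return f(x), similar(x)
    end
    return DifferentiationInterface.value_and_gradient(f, adtype, x)
end

# The empty case no longer reaches ForwardDiff's GradientResult construction.
value_and_gradient_or_empty(sum, AutoForwardDiff(), Float64[])
```

This only papers over the difference at the call site; a proper fix would presumably live in the backend extensions themselves.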