Skip to content

Define private dot function to avoid type-piracy #749

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

tkf
Copy link
Contributor

@tkf tkf commented Sep 17, 2019

As of JuliaLang/julia#32739, Julia has 3-arg dot function. Using Optim master with Julia 1.4-DEV produces a warning

WARNING: Method definition dot(Any, Any, Any) in module LinearAlgebra at /.../stdlib/v1.4/LinearAlgebra/src/generic.jl:903 overwritten in module Optim at /.../Optim/src/multivariate/precon.jl:23.

This PR fixes the warning by defining a private function Optim.dot.

@fredrikekre
Copy link
Contributor

#748

@codecov
Copy link

codecov bot commented Sep 17, 2019

Codecov Report

Merging #749 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #749      +/-   ##
==========================================
+ Coverage   81.53%   81.53%   +<.01%     
==========================================
  Files          43       43              
  Lines        2431     2432       +1     
==========================================
+ Hits         1982     1983       +1     
  Misses        449      449
Impacted Files Coverage Δ
src/Optim.jl 100% <100%> (ø) ⬆️
src/multivariate/solvers/constrained/samin.jl 75.91% <0%> (-0.73%) ⬇️
src/multivariate/precon.jl 70% <0%> (+10%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20b2994...4d29e0c. Read the comment docs.

@tkf
Copy link
Contributor Author

tkf commented Sep 17, 2019

Thanks, I didn't notice there is another PR. I'd actually argue this PR is better because it fixes the root cause (= type-piracy). But I guess it's up to Optim devs to choose, in the end.

@pkofod
Copy link
Member

pkofod commented Sep 19, 2019

Thanks. I understand your points, but I merged the other PR. Either way, you should be able to use master Optim on master now :)

@pkofod pkofod closed this Sep 19, 2019
@tkf
Copy link
Contributor Author

tkf commented Sep 19, 2019

@pkofod I appreciate that the immediate problem is quickly resolved. But let me point out type-pirating LinearAlgebra.dot still has some problems. For example, this definition

dot(A::Array, P::Diagonal, B::Array) = dot(A, P.diag .* B)

overwrites the definition in LinearAlgebra:

function dot(x::AbstractVector, D::Diagonal, y::AbstractVector)
    mapreduce(t -> dot(t[1], t[2], t[3]), +, zip(x, D.diag, y))
end

The definition in Optim allocates an intermediate array. Also, IIRC, mapreduce above actually use mapfoldl instead of recursion-based reduce (i.e. not using pairwise summation). So, the exact answer of dot(::Vector, P::Diagonal, B::Vector) can be different before and after importing Optim:

julia> using LinearAlgebra

julia> x = rand(1000);

julia> y = rand(1000);

julia> D = Diagonal(rand(1000));

julia> dot(x, D, y)
127.29363992864297

julia> using Optim

julia> dot(x, D, y)
127.29363992864292

I think this can introduce very subtle bugs which are hard to debug.

@pkofod
Copy link
Member

pkofod commented Sep 19, 2019

So do you know what version of LinearAlgebra introduced this? Same commit as the other one? I think the solution is just to wrap that definition as well.

I get your point, we could just define our own dot, but long term (post 1.4) that's not what we want to do, we want to import and if necessary extend (no need for piracy as base/stdlib now has these) LinearAlgebra.dot. So I think wrapping definitions in if blocks is fine, the one you pointed out was just missing.

Sorry I'm not actively fixing this myself, I'm just not ready for 1.4 yet :)

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

Sorry if it sounds very nit-picky, but Optim also defines dot(A, ::Nothing, B) which is also a type-piracy (although arguably a safe one). I'm OK with type-piracy when it's the only sensible solution. I just don't think you need it here.

Same commit as the other one?

I think so. I'll check it and open a PR.

@pkofod
Copy link
Member

pkofod commented Sep 20, 2019

Sorry if it sounds very nit-picky

That's fine, happy to discuss. (though its late here :) )

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

By the way, I noticed that ldiv! is also doing type-piracy 😈. I think it can also be avoided by a similar code.

A better long-term option might be to use https://github.com/JuliaArrays/LazyArrays.jl as handling something like ldiv!(out::Array, P::InverseDiagonal, A::Array) is in its scope.

@pkofod
Copy link
Member

pkofod commented Sep 20, 2019

By the way, I noticed that ldiv! is also doing type-piracy 😈. I think it can also be avoided by a similar code.

😑

A better long-term option might be to use https://github.com/JuliaArrays/LazyArrays.jl as handling something like ldiv!(out::Array, P::InverseDiagonal, A::Array) is in its scope.

Sounds reasonable!

@antoine-levitt
Copy link
Contributor

Yeah or LinearMaps, with a function call to apply the ldiv (which I think LinearMaps does not actually support easily now, though)

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

I guess it's not hard to hook LinearMaps into LazyArrays API. The syntax would be a bit nicer (e.g., y .= @~ α * A * x + β * y to invoke mul!).

@antoine-levitt
Copy link
Contributor

The problem is that Optim expects P to define ldiv!, which LinearMaps don't do. This whole thing would be so much simpler if people used the sane convention that P is an approximation to the inverse of the hessian, instead of the reverse. But I guess that war is lost by now :-p

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

I'm not sure if I follow. I'm just an "end-user" when it comes to the actual optimization algorithms. But do you mean that you prefer to directly define the linear map that does the multiplication by the inverse of the hessian?

@antoine-levitt
Copy link
Contributor

Yeah, that's usually what happens for large scale systems (where ldiv! could be, eg, a multigrid algorithm or something). Also you never have to apply P, just inv(P). The same problem pops up in iterative solvers for linear systems or for eigenvalue problems. That's not covered well by the usual packages (eg LinearMaps), so what I usually do is define a custom object that implements ldiv! (and maybe a few others, as needed by the algorithm).

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

I see. Then LazyArray may be useful here. You can define a linear map f, wrap it into the "lazy inverse" as P = @~ inv(f) (which does not evaluate anything), and then Optim calls y .= @~ P \ x to actually run mul! of f. (Although I guess that's equivalent to what you already do.)

@antoine-levitt
Copy link
Contributor

I think it's better for Optim to remain agnostic to what P actually is, and not use LazyArrays. I'm fine with the current situation, it's just a bit unfortunate that the community has decided on P being an approximation to A instead of an approximation to A^-1.

@tkf
Copy link
Contributor Author

tkf commented Sep 20, 2019

Well, I'm personally hoping LazyArrays to be as universal as broadcasting :)

@timholy
Copy link
Contributor

timholy commented Sep 20, 2019

A reminder that Compat.jl is an officially-sanctioned mechanism for "backporting" features from newer Julia versions to older ones (even if it's piracy), and one that allows other packages to take advantage of the features.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants