cond (and normestinv) return inconsistent values #301

Closed · rfourquet opened this issue Jan 24, 2016 · 24 comments · Fixed by JuliaLang/julia#15113
Labels: sparse (Sparse arrays)

@rfourquet
Member
Sorry for the awkward phrasing of the title.
The following reproduces the test case at line 1102 of test/sparsedir/sparse.jl, which fails with PR JuliaLang/julia#14772:

julia> srand(5)
       Ac = sprandn(20,20,.5) + im* sprandn(20, 20,.5)
       cond(Ac,1) - cond(full(Ac),1)
30.262691632830524

julia> length(Set([Base.SparseArrays.normestinv(Ac) for x in 1:50]))
3
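
Note that the second result shows repeated calls returning a handful of distinct values: the block 1-norm estimator starts from random vectors. A minimal sketch of a reproducible call, assuming normestinv draws its starting vectors from the global RNG:

# Sketch: pin the global RNG seed so the randomized estimator is reproducible
# (assumes normestinv draws its starting block from the global RNG).
srand(5); est1 = Base.SparseArrays.normestinv(Ac)
srand(5); est2 = Base.SparseArrays.normestinv(Ac)
est1 == est2  # true under that assumption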
@andreasnoack
Member

@mfasi What is known about the magnitude of the error? This one seems a bit large, but it matches the result I get in MATLAB with the same matrix.

@rfourquet
Member Author

The test was commented out in JuliaLang/julia#14772; it should be re-enabled once this issue is fixed.

@KristofferC
Member

FWIW, I ran into a similar (possibly the same) problem half a year ago: JuliaLang/julia#12734 (comment). The test failed when I added another test above it, so I suspect it is sensitive to the random seed.

@jiahao
Member

jiahao commented Feb 16, 2016

Generally, condition-number computations can be very imprecise, being loose upper bounds on the true condition number that are cheap to compute rather than precise estimates.

However, in this case, it looks like the discrepancy is in the computation returned by LAPACK.gecon!.

julia> cond(full(Ac), 1) #Dense case, uses the LU factorization cond(lufact(full(Ac)), 1), calling LAPACK.gecon!
100.3697356478622

julia> norm(full(Ac), 1)
19.86009075965159

julia> norm(inv(full(Ac)), 1)
6.577634959558674

julia> norm(full(Ac), 1) * norm(inv(full(Ac)), 1) #!= cond(full(Ac), 1) !
130.6324272806925

julia> cond(Ac, 1) #Essentially the same as the preceding
130.63242728069272

julia> SparseMatrix.normestinv(Ac) #Estimate is essentially identical to the exact answer
6.5776349595586865

julia> norm(Ac, 1) #Essentially identical to dense norm
19.86009075965159

julia> SparseMatrix.normestinv(Ac) * norm(Ac, 1) # = cond(Ac, 1)
130.63242728069272

The norm of A passed into LAPACK.gecon! is correct, being norm((A[:L] * A[:U])[A[:p],:],p) = 19.86009075965159.

@andreasnoack thoughts?

@andreasnoack
Member

Interesting. According to the LAPACK manual, xGECON is based on

N. J. HIGHAM, FORTRAN codes for estimating the one-norm of a real or complex matrix, with applications to condition estimation, ACM Trans. Math. Softw., 14 (1988), pp. 381-396.

which reports that the procedure does well for random matrices but can do badly for some non-random matrices. In Applied Numerical Linear Algebra, Demmel writes that a good estimate of norm(inv(A),1) "...should be within a factor of ten", since "This is all one needs for an error bound which tells you about how many decimal digits of accuracy that you have...", so we should probably just adjust the test.
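
A minimal sketch of what an adjusted test might look like, following Demmel's factor-of-ten rule of thumb (the specific tolerance below is an assumption, not necessarily the one ultimately adopted):

using Base.Test
srand(5)
Ac = sprandn(20, 20, 0.5) + im*sprandn(20, 20, 0.5)
# Compare the two estimates by their ratio rather than their difference:
# condition-number estimates are only expected to agree to within ~10x.
r = cond(Ac, 1) / cond(full(Ac), 1)
@test 0.1 <= r <= 10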

@jiahao
Member

jiahao commented Feb 16, 2016

It looks like xGECON is known to produce arbitrarily bad underestimates of the condition number for some pathological matrices (pp. 385–386 of the Higham paper cited above). So AFAICT the underestimation of the condition number is not unexpected, and there's no reason to suspect that any of the algorithms are broken.

@andreasnoack
Member

True, but this matrix doesn't seem similar to the problematic matrices in the paper: the matrix in this issue is random and has a small condition number. I agree that the algorithms are probably not wrong, so adjusting the tests and updating the documentation seems like the right solution.

Apparently, it is conjectured that no bounds exist for cheap estimates of the condition number, but that in "most cases" the estimate is within a factor of ten.

@jiahao
Member

jiahao commented Feb 16, 2016

All I meant to say is that I was wrong in thinking that condition number estimates always overestimate.

Perhaps we could solicit feedback from @higham to be sure that our understanding so far is correct.

@higham

higham commented Feb 16, 2016

I don't see anything of concern above.
The LAPACK estimator always produces a lower bound, modulo roundoff.
It's the ratio cond(Ac,1)/cond(full(Ac),1) that is relevant (but not reported above), and it is typically less than about 3.
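
For the matrix in this issue, that ratio can be read off from the values @jiahao posted above:

ratio = 130.63242728069272 / 100.3697356478622  # cond(Ac,1) / cond(full(Ac),1) ≈ 1.3015, well below 3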

@jiahao
Member

jiahao commented Feb 16, 2016

Thanks @higham for the comment, much appreciated.

andreasnoack referenced this issue in JuliaLang/julia on Feb 17 and Feb 18, 2016
@MikaelSlevinsky

Maybe the fact that the Julia implementations of cond(A,1) and cond(A,Inf) are lower estimates should be in the documentation? I just ran into this by trying:

julia> H(w) = I-2/dot(w,w)*w*w'
H (generic function with 1 method)

julia> w = linspace(0,1,5)
5-element LinSpace{Float64}:
 0.0,0.25,0.5,0.75,1.0

julia> cond(H(w),1)
2.7600000000000002

julia> norm(H(w),1)*norm(inv(H(w)),1)
3.24

Note that H(w) is neither particularly large nor ill-conditioned (it's a Householder matrix), so the discrepancy was a surprise.
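
For comparison, since H(w) is a Householder reflection (hence orthogonal), its 2-norm condition number is exactly 1, and the p = 2 case is computed from the singular values rather than estimated:

cond(H(w), 2)  # ≈ 1.0 up to roundoff: H(w) is orthogonal, and the 2-norm case uses the SVD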

@higham

higham commented Sep 28, 2016

Yes - I think it should be mentioned. The docs for cond I just found at

http://docs.julialang.org/en/release-0.5/stdlib/linalg/?highlight=cond

say "Condition number of the matrix M, computed using the operator p-norm. Valid values for p are 1, 2 (default), or Inf.", which is misleading if, in fact, only an estimate is computed.

@MikaelSlevinsky

Now that I've had a while to think about this, I believe the current method of calculating 1- and Inf-norm condition numbers should live in a different function, like MATLAB's condest, say.

The reasoning is that it's imprudent to return a lower estimate of what is, by definition, a least upper bound. On the other hand, if the matrix is ill-conditioned, then implementing cond via its definition as a product of norms will give an upper estimate.
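
A hypothetical sketch of the proposed split (mycond and mycondest are illustrative names, not real or proposed Base functions):

# Illustrative only: the definition versus a cheap estimate.
mycond(A, p) = norm(A, p) * norm(inv(A), p)  # exact in exact arithmetic; forms inv(A) explicitly
mycondest(A) = norm(A, 1) * Base.SparseArrays.normestinv(A)  # cheap lower estimate, a la MATLAB's condest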

@andreasnoack
Member

My training is in statistics rather than numerical analysis, so I'm not sure of the exact meaning of "estimate" in NA; Golub & Van Loan don't seem to define it (according to the index). Coming from statistics, it is tempting to think of all these condition-number functions as estimators with different properties: some have larger variance than others, but it's not as if norm(A,1)*norm(inv(A),1) is free of error either. In any case, we should state clearly in the documentation which method is used and what we know about its properties.

@higham

higham commented Oct 3, 2016

In numerical analysis, an "estimate" means not exact in exact arithmetic. norm(A,1)*norm(inv(A),1) is the exact 1-norm condition number. The inexactness when this quantity is evaluated in floating-point arithmetic would not normally cause us to describe it as an estimate.

@andreasnoack
Member

Thanks for the clarification. The question is whether we really want two different functions for computing the condition number. How often would people want to use the expensive exact-in-exact-arithmetic version? My feeling right now is that we should have only a single function (i.e. cond) and then clearly document what exactly is computed for each input.

@MikaelSlevinsky

Sorry, my argument should not have used the word "estimate"; it applies in both exact and inexact arithmetic:

It's imprudent to return a lower bound on what is a least upper bound. On the other hand, if the matrix is ill-conditioned, then cond(A,p) = norm(A,p)*norm(inv(A),p) calculated in inexact arithmetic will be bounded below by the condition number calculated in exact arithmetic.

Surely, a lower bound within a factor of three or even ten is excellent (and perhaps far closer to the true condition number than the value returned by norm(A,p)*norm(inv(A),p)), but it is not the condition number of the matrix in exact arithmetic; thus the current implementation should be a method of a different function, condest, say.

@MikaelSlevinsky

I'd settle for updated documentation, but I also don't think it's so onerous to implement the two different functions cond and condest. MATLAB, R, and Octave (via packages if not natively) do this. Perhaps @dlfivefifty has an opinion.

@higham

higham commented Oct 3, 2016

"if the matrix is ill-conditioned then implementing cond(A,p) = norm(A,p)*norm(inv(A),p) calculated in inexact arithmetic will be bounded below by the condition number calculated in exact arithmetic." I'd be interested to see a proof of that; it's not clear to me.

@MikaelSlevinsky

OK, that's not true. (A counterexample is the family of Hilbert matrices: numerically computed 1-norm condition numbers plateau at roughly 10^18, while the exact condition numbers grow without bound as the dimension n increases.)
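
A sketch of that plateau (hilb is an ad-hoc helper, not a Base function):

hilb(n) = [1/(i + j - 1) for i = 1:n, j = 1:n]  # n-by-n Hilbert matrix
# In Float64 the computed product plateaus once inv() is dominated by
# rounding error, while the exact value keeps growing with n.
for n in (5, 10, 15, 20)
    println(n, "  ", norm(hilb(n), 1) * norm(inv(hilb(n)), 1))
end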

But it doesn't change the surprising discrepancy in the low-dimensional well-conditioned example I showed earlier.

@StefanKarpinski
Member

> I also don't think it's so onerous to implement the two different functions cond and condest

The issue is not that we're too lazy to implement two different functions, but rather that having two functions that do almost the same thing may confuse users: how do they know when to use which? Many systems deal with multiple choices by simply presenting the user with all of them, but that's lazy design. Perhaps there are good reasons to have both here, but we try not to have two functions when one is enough.

@tkelman

tkelman commented Oct 4, 2016

condest in these other environments is documented to use an algorithm that is less expensive but less accurate. I don't think combining the two under the same function name is appropriate.

xref #104 JuliaLang/julia#12467 JuliaLang/julia#1867

@andreasnoack
Member

Generally (and as mentioned in the linked threads), I don't think we should have several functions for the same quantity. E.g., if we add support for a QR-based symmetric eigensolver, I don't think we should keep a separate function for the RRR-based version we use now, even though the latter is less expensive and less accurate. We are not strict about this (e.g. eig/eigs, and maybe a similar argument can be made for lufact/qrfact), but in those cases we return different types, so separate functions are necessary. That is not the case here, so I think the best API is a single function with a reasonable default, perhaps with a keyword argument for choosing the algorithm, and then to document sufficiently what is computed.
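
A hypothetical sketch of such an API (the name mycond and the method keyword are illustrative only, not a real or proposed signature):

# Illustrative only: one entry point, algorithm chosen by a keyword argument.
function mycond(A::SparseMatrixCSC, p::Real=1; method::Symbol=:estimate)
    if method == :estimate && p == 1
        return norm(A, 1) * Base.SparseArrays.normestinv(A)  # cheap lower estimate
    elseif method == :exact
        return norm(full(A), p) * norm(inv(full(A)), p)  # exact in exact arithmetic
    else
        throw(ArgumentError("unsupported p/method combination"))
    end
end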

@tkelman

tkelman commented Oct 5, 2016

This isn't the same quantity though. It's a rough estimate of the quantity the documentation suggests it's calculating.
