decomposition and factorization terminology #26995

Closed
Sacha0 opened this issue May 5, 2018 · 19 comments
Labels: linear algebra

Comments

@Sacha0
Member

Sacha0 commented May 5, 2018

Singular value decomposition factorization (svdfact) is a slightly unfortunate name; minor a thing as it is, the redundancy chafes. Perhaps svdecomp, svfact, or something similar would be better? Best!

Sacha0 added the linear algebra label May 5, 2018
@antoine-levitt
Contributor

The usual acronyms are LU, QR, SVD. Unfortunate perhaps but well established, so it would be better to have consistency across these factorizations rather than consistency of the name itself. Reading svdecomp for instance makes me wonder what an ecomp is.

@ViralBShah
Member

It never occurred to me, but now that you mention it, it does sound weird. However, I don't ever expand the SVD in my mind when I think about it, and perhaps it is ok to let it be as it is.

@Sacha0
Member Author

Sacha0 commented May 13, 2018

A broader thought in the same vein: At the moment we mix the terms decomposition and factorization (e.g. Computes the eigenvalue decomposition of A, returning an Eigen factorization object F [...]). Decomposition is the more common and slightly more general term. Perhaps we should use decomposition consistently? Best!

@ViralBShah
Member

I believe there was a discussion thread on which name to use, and we picked factorizations early on. We should dig up and link the original discussion at the very least.

Sacha0 changed the title from "redundancy in svdfact redundancy" to "decomposition and factorization terminology" May 14, 2018
@Sacha0
Member Author

Sacha0 commented May 14, 2018

A bit of git spelunking revealed the following history: Miles Lubin introduced the name lufact confined to an UMFPackLU wrapper via 53c2d30. Later, Doug Bates introduced *d names, e.g. lud and qrd, for decompositions broadly via #1281 and #1290. In #1281, Viral pointed out that the *d names were opaque, and suggested extending the UMFPackLU lufact to *fact/ "factorization" generally instead. Shortly thereafter, Tim Holy suggested *dcmp/"decomposition" as an alternative, and Doug Bates expressed remorse for introduction of "factorization" terminology and a preference for *decomp/"decomposition", but said he'd go with either decision. Viral responded saying he would be happy with either name and left the call to Doug, though he liked *fact's brevity. Mike shared a little support for "decomposition" then, and Doug likewise. That's where the conversation appears to leave off. Viral later committed 69e407b, renaming the *d functions to *fact, and here we find ourselves :).

Out of curiosity, I checked the number of google hits for "X decomposition" and "X factorization", and while I had the impression that decomposition was the more widespread term, the degree to which that appears true surprised me; results in millions below:

Update regarding the table below: These hit counts were for unquoted search queries, whereas quoted search queries are probably a better metric. With quoted queries, which term the hit counts favor depends on the decomposition, and the results are much less compelling overall. Ref. #26995 (comment).

| X | decomposition | factorization |
|---|---|---|
| lu | 15.3 | 1.07 |
| qr | 12.3 | 0.34 |
| singular value | 2.58 | 0.68 |
| eigen | 0.41 | 0.08 |
| cholesky | 0.4 | 0.2 |
| schur | 0.28 | 0.43 |

Tangentially, the history suggests that the only reason for the *fact/*d names was to retain MATLAB compatibility in lu, qr, and friends. But with the MATLAB-like functions lu, qr, et al now being deprecated in favor of *fact, deprecating the *fact names to lu, qr, et al becomes possible in 1.x (discussed briefly in #25187). Best!

@ViralBShah
Member

Thank you for that detailed analysis!

@StefanKarpinski
Member

Yes, I'm very much in favor of making breaking changes to LinearAlgebra 2.0 in some Julia 1.x release where the names are just lu, schur, chol, etc. but the objects returned are factorization objects. We can retain the ability to write code like L, U = lu(X) by defining iteration of the factorization objects to yield the expected components. Let's spend the intervening time thinking about what the best design for this kind of API would be without any historical baggage.
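To make the iteration idea concrete, here is a minimal sketch of how destructuring a factorization object could work. `ToyLU` is a hypothetical type invented for illustration (it does no pivoting and is not the LinearAlgebra implementation); only `Base.iterate` is a real Julia interface.

```julia
# Hypothetical toy type for illustration; not the LinearAlgebra implementation.
struct ToyLU
    L::Matrix{Float64}
    U::Matrix{Float64}
end

# Iteration protocol: yield L first, then U, then stop.
Base.iterate(F::ToyLU) = (F.L, Val(:U))
Base.iterate(F::ToyLU, ::Val{:U}) = (F.U, Val(:done))
Base.iterate(F::ToyLU, ::Val{:done}) = nothing

F = ToyLU([1.0 0.0; 0.5 1.0], [2.0 4.0; 0.0 1.0])
L, U = F      # destructuring goes through Base.iterate
A = L * U     # multiplying the factors recovers the matrix they represent
```

With this approach the function can return a single object while callers who only want the factors keep the familiar `L, U = ...` syntax.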

@Sacha0
Member Author

Sacha0 commented May 15, 2018

> Yes, I'm very much in favor of making breaking changes to LinearAlgebra 2.0 in some Julia 1.x release where the names are just lu, schur, chol, etc. but the objects returned are factorization objects. We can retain the ability to write code like L, U = lu(X) by defining iteration of the factorization objects to yield the expected components.

Agreed! And likewise Andreas it seems. #26997 should at least set us up for those changes during 1.x, and potentially non-breaking then. Best!

@ViralBShah
Member

I am not sure how these Google hits were computed, but I don't get anything above 150,000-ish on anything, and even so, no more than 13-14 pages of results.

@Sacha0
Member Author

Sacha0 commented May 21, 2018

> I am not sure how these Google hits were computed, but I don't get anything above 150,000-ish on anything, and even so, no more than 13-14 pages of results.

The difference is quoting versus not quoting the search query :).

@ViralBShah
Member

ViralBShah commented May 21, 2018

I think one ought to quote it, which is what I thought you did since you did say "X decomposition" and "X factorization". Even if I do it without quotes for lu, I get the same numbers roughly, about 1.2-1.3M, and not 15.3M vs. 1M. I don't think Google hits are a reliable way to decide this.

@Sacha0
Member Author

Sacha0 commented May 21, 2018

A slack conversation convinced me that the hit counts for quoted search queries are a better metric, and for such queries which term the hit count favors depends on the particular decomposition; in other words, ignore the table above, as it's probably not the best guide. The remaining question is a minor one of correctness, in that e.g. decomposition is in some cases perhaps more correct for eig than factorization, but whether that's worth bothering about 🤷‍♂️. Best!

@StefanKarpinski
Member

I would point out that while the eigenvectors and eigenvalues are not a factorization as a pair (you can’t multiply them and get the original matrix back), the factorization object does act as a true factorization in that you can use it in place of the original matrix as a “pre-factorized” stand-in. Moreover, you can get one of these objects through a function called, yes, factorize, not decompose.
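As a rough illustration of the "pre-factorized stand-in" point, the sketch below uses `factorize` from the LinearAlgebra standard library and then solves a linear system with the returned object instead of the matrix. The matrix and right-hand side are made-up example values.

```julia
using LinearAlgebra

A = [4.0 1.0; 2.0 3.0]   # example values, chosen arbitrarily
b = [1.0, 2.0]

F = factorize(A)         # picks a suitable factorization for A
x = F \ b                # F is used where A would otherwise appear

# The solution matches what solving with the original matrix gives.
same = x ≈ A \ b
```

Reusing `F` for several right-hand sides avoids refactorizing `A` each time, which is the practical sense in which the object stands in for the matrix.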

@Sacha0
Member Author

Sacha0 commented May 21, 2018

A little further slack triage settled on the status quo, i.e. retaining factorize/Factorization (and I imagine consequently continuing to use somewhat mixed decomposition/factorization terminology). Best!

@o314
Contributor

o314 commented May 22, 2018

First, IMHO (I have spent thousands of hours working on math, engineering, and ontology, or terminology, call it what you want):

Factorization is grounded in arithmetic.
When the matrix field is numerical, factorization frequently pops up here and there.

Composition is more compatible with the evolution toward symbolic programming.

It's a natural movement found when trying to solve equations (math work) or assembling things (engineering work). As with dynamic programming, we break a problem into multiple pieces, and with the property of the zero element / absorbing element of a group we solve the whole by constraining either one part or the other, thanks to the law of excluded middle.


Secondly, some facts:

Wolfram uses "matrix decomposition" everywhere.

So does Google Scholar when you look at symbolic computing:

| search | hits |
|---|---|
| "category decomposition" | 278 |
| "category factorization" | 27 |
| "graph decomposition" | 7880 |
| "graph factorization" | 700 |
| "ideal decomposition" | 1170 |
| "ideal factorization" | 795 |
| "lattice decomposition" | 731 |
| "lattice factorization" | 261 |
| "monad composition" | 108 |
| "monad decomposition" | 7 |
| "monad factorization" | 1 |

Bonus: guess how to check the Pantelides thing.


IMHO Julia at large is better served by decomposition than factorization.
In linear algebra, however, factorization may remain more common.

@andreasnoack
Member

andreasnoack commented Jun 5, 2018

Can this be closed now that the factorization functions no longer have fact in their names or would people like to discuss this topic further?

@fredrikekre
Member

I guess what's left is to decide if we should rename Factorization to Decomposition.

@andreasnoack
Member

andreasnoack commented Jun 5, 2018

I see. Even if decomposition were slightly better than factorization (which I don't think it is), it's not worth the name change.

@StefanKarpinski
Member

After extensive discussion we concluded that the two words are used about equally often when talking about matrices but that "factorization" is much more matrix-specific and thus conveys more information. It's also the one we're already using and it's no longer very user-facing, so we do nada.

7 participants