Debugging LinAlgError - any idea what is going on? #132
Comments
I think there are multiple ways that the data can become ill-conditioned enough to throw an error like this, and it is sometimes tricky to diagnose because it can arise from the original data set itself or only after the data are transformed by the weights. In my experience, it sometimes arises from small samples/bandwidths, a lack of variation within samples, or both. I suppose it would be possible to pull out the individual weighted sample for the particular local regression to try to diagnose things more formally, but implementing a fix as @ljwolf suggested in #116 may help stabilize things even without understanding the cause.
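To make the "pull out the individual weighted sample" idea a bit more concrete, here is a minimal sketch of what such a diagnostic could look like. The helper name `local_design_diagnostics` and the toy data are made up for illustration and are not part of the mgwr API; it only assumes you can get the design matrix `X` and the kernel weights `w` for one location.

```python
import numpy as np

def local_design_diagnostics(X, w):
    """Rank and condition number of the weighted design at one location.

    X : (n, k) design matrix, w : (n,) non-negative kernel weights.
    Hypothetical helper for illustration, not an mgwr function.
    """
    Xw = X * np.sqrt(w)[:, None]                 # scale rows by sqrt(weights)
    svals = np.linalg.svd(Xw, compute_uv=False)  # singular values, descending
    rank = np.linalg.matrix_rank(Xw)
    cond = np.inf if svals[-1] == 0 else svals[0] / svals[-1]
    return rank, cond

rng = np.random.default_rng(0)
x = rng.normal(size=50)
X_ok = np.column_stack([np.ones(50), x])     # intercept + one covariate
X_bad = np.column_stack([x, 2 * x, 3 * x])   # perfectly collinear columns
w = rng.uniform(0.1, 1.0, size=50)           # stand-in for kernel weights

rank_ok, cond_ok = local_design_diagnostics(X_ok, w)
rank_bad, cond_bad = local_design_diagnostics(X_bad, w)
print(rank_ok, cond_ok)
print(rank_bad, cond_bad)
```

A rank below the number of columns, or a very large condition number, would flag the local regression as ill-conditioned before any inverse is attempted.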
Possibly, but it would need to go beyond what @ljwolf suggested, as this specific error comes from
@TaylorOshan do you have any idea how to formulate an error message that is generic enough but gives a user a bit more of an idea of what has happened than the current
well taylor may disagree with me ;P but multicollinearity can happen in unforeseen ways in gwr
the opaque but foreboding error is a reasonable way to let people know they need to think more about the model, imo :)
To note, adding a small random value to the diagonal is a distinct fix from moving to a pseudo inverse strategy.
And the pseudo inverse can indeed be swapped in for the solve call: anywhere solve/inv is called, you can swap in a pseudo inverse. Solving for xtx inverse xt is exactly the context linked in #116 in statsmodels.
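For anyone following along, the two strategies mentioned above can be sketched side by side like this. This is only an illustration on made-up data (the variable names and the explicit `rcond=1e-10` cutoff are my assumptions, not what mgwr or statsmodels do internally).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(30, 1))
X = np.hstack([x, 2 * x, 3 * x])   # rank-1 design, so X.T @ X is singular
y = 0.1 * x + rng.normal(scale=0.01, size=(30, 1))

xtx = X.T @ X

# Strategy 1: swap a pseudo-inverse in where inv/solve would be called.
# rcond is set explicitly (an assumption) so rounding noise is truncated.
betas_pinv = np.linalg.pinv(xtx, rcond=1e-10) @ X.T @ y

# Strategy 2: add a small value to the diagonal, then solve as usual.
ridge = 1e-5 * np.eye(X.shape[1])
betas_ridge = np.linalg.solve(xtx + ridge, X.T @ y)

print(betas_pinv.ravel())
print(betas_ridge.ravel())
```

On this toy data both routes land on essentially the same minimum-norm coefficients, but they are different operations: the pseudo-inverse discards the near-zero singular directions, while the ridge shifts every eigenvalue of xtx by a small amount.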
On Sunday, November 5, 2023, eli knaap wrote:
well taylor may disagree with me (https://link.springer.com/article/10.1007/s10109-016-0239-5) ;P but multicollinearity can happen in unforeseen ways in gwr
* https://link.springer.com/article/10.1007/s10109-005-0155-6
* http://journals.sagepub.com/doi/10.1068/a38218
* https://link.springer.com/article/10.1007/s10109-014-0199-6
the opaque but foreboding error is a reasonable way to let people know they need to think more about the model, imo :)
This is way too deep stats/maths for me 🙃. I'll make a PR with a custom error message and ask you to review its text, but will leave the solution using a pseudo inverse or anything else to someone more capable.
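As a rough idea of what a custom error message could look like, here is a minimal sketch. The wrapper `solve_normal_equations` and the message wording are hypothetical (not the text of any actual PR); the wrapper simply catches NumPy's `LinAlgError` and re-raises it with more context.

```python
import numpy as np

def solve_normal_equations(xtx, xty):
    """Hypothetical wrapper: solve xtx @ beta = xty with a friendlier error."""
    try:
        return np.linalg.solve(xtx, xty)
    except np.linalg.LinAlgError as err:
        # Re-raise the same exception type so existing handlers keep working.
        raise np.linalg.LinAlgError(
            "Singular matrix encountered while fitting a local regression. "
            "This usually points to collinear columns in X or to too little "
            "variation in the weighted sample at the current bandwidth."
        ) from err

# Toy usage with an exactly singular 2x2 matrix.
try:
    solve_normal_equations(np.array([[1.0, 2.0], [2.0, 4.0]]),
                           np.array([1.0, 2.0]))
except np.linalg.LinAlgError as err:
    print(err)
```

Keeping the exception type unchanged means anything downstream that already catches `LinAlgError` behaves as before; only the message gets more informative.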
Looking at this today, this is not an

The global regression here is ill-posed: the X matrix is totally collinear, so it's not going to work as a global regression.

>>> from spreg import OLS
>>> OLS(y, X)  # raises LinAlgError

The singularity is due to the

>>> numpy.linalg.inv(X.T @ X)  # raises singular

Some argue you should try to avoid direct calls to

>>> coefs, resids, rank, svs = numpy.linalg.lstsq(X, y)
>>> coefs
array([[0.00656886],
       [0.01313772],
       [0.01970658]])
>>> rank
1
>>> resids
array([], dtype=float64)

Note that the

>>> numpy.linalg.solve(X.T @ X, X.T)  # raises LinAlgError

Note that if we use the Tikhonov trick, we get the same betas, and no warning about the very small singular values:

>>> ridge = numpy.eye(X.shape[1]) * 1e-5  # this is the "ridge" in a ridge regression
>>> tikhonov_betas = numpy.linalg.inv(X.T @ X + ridge) @ X.T @ y
>>> tikhonov_betas
array([[0.00656886],
       [0.01313772],
       [0.01970658]])

So, what's the fix here?
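One way to see why the Tikhonov betas agree with the `lstsq` ones is through the SVD. The notation below (thin SVD $X = U \Sigma V^\top$ with singular values $\sigma_i$, ridge parameter $\lambda > 0$) is introduced here for illustration and does not come from the thread.

```latex
\begin{aligned}
X^\top X + \lambda I &= V\,(\Sigma^2 + \lambda I)\,V^\top, \\
(X^\top X + \lambda I)^{-1} X^\top y
  &= V\,(\Sigma^2 + \lambda I)^{-1}\,\Sigma\,U^\top y
   = V \operatorname{diag}\!\left(\frac{\sigma_i}{\sigma_i^2 + \lambda}\right) U^\top y, \\
\lim_{\lambda \to 0^+} \frac{\sigma_i}{\sigma_i^2 + \lambda}
  &= \begin{cases} 1/\sigma_i, & \sigma_i > 0, \\ 0, & \sigma_i = 0, \end{cases}
\qquad\Longrightarrow\qquad
\lim_{\lambda \to 0^+} (X^\top X + \lambda I)^{-1} X^\top y = X^{+} y .
\end{aligned}
```

So for a small ridge the solution approaches the minimum-norm least-squares solution $X^{+} y$, which is also what `numpy.linalg.lstsq` returns for a rank-deficient `X`.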
I've been occasionally hitting LinAlgError as reported in #94 or #116, and I wanted to better understand what is causing it and when it happens. This is one of the toy examples I came up with where I am able to reproduce it but have no idea why.
Then, using this very specific bandwidth, I get the singular matrix error.
But changing it even slightly to another value, larger (12.392) or smaller (12.390), makes it work again. I am just not able to figure out where this number comes from. My first idea was that it is some specific pairwise distance, but it is not.
The requirement for this to happen is collinearity within X, but why it happens for this specific bandwidth is unclear to me. Does anyone have an idea?
I started digging into this to either fix it as @ljwolf suggested in #116 or to at least provide an informative error message, but given that I am not sure what exactly is going on, I don't even know how to formulate the error.
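One possible angle on the bandwidth sensitivity, offered as a hedged sketch rather than a diagnosis of the actual example: with collinear columns, the weighted cross-product matrix is rank-deficient at every bandwidth, but as far as I understand `numpy.linalg.inv`/`solve` only raise when the LU factorization hits an exactly zero pivot, which depends on floating-point rounding. The fixed bisquare kernel, the coordinates, and the bandwidth grid below are all assumptions made up for illustration.

```python
import numpy as np

def bisquare(d, bw):
    """Fixed bisquare kernel (an assumed kernel choice for this sketch)."""
    w = (1 - (d / bw) ** 2) ** 2
    w[d >= bw] = 0.0
    return w

rng = np.random.default_rng(2)
coords = rng.uniform(0, 20, size=(40, 2))
x = rng.uniform(1, 10, size=40)
X = np.column_stack([x, 2 * x, 3 * x])           # collinear by construction
d = np.linalg.norm(coords - coords[0], axis=1)   # distances from location 0

report = []
for bw in np.linspace(5, 25, 9):
    w = bisquare(d, bw)
    xtwx = (X * w[:, None]).T @ X                # X' W X for this location
    rank = np.linalg.matrix_rank(X * np.sqrt(w)[:, None])
    try:
        np.linalg.inv(xtwx)
        raised = False                           # "worked", but output is junk
    except np.linalg.LinAlgError:
        raised = True
    report.append((bw, rank, raised))

for bw, rank, raised in report:
    print(f"bw={bw:5.2f}  rank={rank}  LinAlgError={raised}")
```

The rank stays below the number of columns for every bandwidth in the grid, whether or not the inverse call raises, so a bandwidth that "works" may simply be returning numerically meaningless coefficients.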