Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lsqr() for fixed effects solver #546

Merged
merged 20 commits into from
Jul 21, 2024
Merged

Conversation

greenguy33
Copy link
Contributor

Closes #469

@greenguy33
Copy link
Contributor Author

Looks like method _matmul_dispatch() in scipy.sparse is throwing an error based on this change:

        if issparse(other):
            if self.shape[-1] != other.shape[0]:
>               raise ValueError('dimension mismatch')
E               ValueError: dimension mismatch

../../../.cache/pypoetry/virtualenvs/pyfixest-va_uSH9_-py3.9/lib/python3.9/site-packages/scipy/sparse/_base.py:603: ValueError

I'll take a look at what's going asap.

@s3alfisc
Copy link
Member

I think lsqr might return a tuple? https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsqr.html so we'd have to pick the first return element.

`uhat` as dense 1D array
@s3alfisc
Copy link
Member

Another option to fix the failing tests is to tighten the convergence criteria for the iterative lsqr solver. I.e. by setting atol=1e-10 and btol=1e-10, all tests pass:

        alpha = lsqr(D2, uhat, atol = 1e-10, btol = 1e-10)[0]

Note that scikit-learns LinearRegession class also builds on the lsqr solver, and they are satisfied with the default tolerances for atol and btol of 1e-06.

I think I would suggest the following: we keep the 1e-06 defaults as our defaults, but add function arguments atol and btol to the .predict() and .fixef() methods so that users can specify their desired numerical accuracy.

What do you think Hayden (@greenguy33)? Any feedback from a user's perspective on this, @b-knight?

@greenguy33
Copy link
Contributor Author

That sounds good to me. I am mostly offline this weekend but can definitely make this change in the coming week. I guess it will require some changes in the tests as well, to supply the passing values of 1e-10?

Copy link

codecov bot commented Jul 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
pyfixest/estimation/feols_.py 90.59% <100.00%> (ø)
tests/test_did.py 100.00% <100.00%> (ø)
tests/test_predict_resid_fixef.py 92.80% <100.00%> (+0.49%) ⬆️

... and 26 files with indirect coverage changes

@greenguy33
Copy link
Contributor Author

greenguy33 commented Jul 19, 2024

@s3alfisc Looks like tests are passing now. However in order to get that to happen I changed the tolerance values for np.testing.assert_allclose in two test files. I'm not sure if these tests were passing before because it doesn't appear that they are running the lsqr solver.
Supplying tolerance values of atol = 1e-05, rtol = 1e-05 seemed to work, and smaller values tend to fail

@@ -1358,7 +1358,7 @@ def ccv(
n_splits=n_splits,
)

def fixef(self) -> dict[str, dict[str, float]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add docstrings that explain atol and btol and that refer to the scipy lsqr docs?

https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lsqr.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I've seen you did it already for predict. Thanks!

@s3alfisc s3alfisc merged commit 3d2e4d0 into py-econometrics:master Jul 21, 2024
7 checks passed
@s3alfisc
Copy link
Member

Just merged, thank you @greenguy33!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fixef: Implement Iterative Least Squares Solver
2 participants