Add "lean" argument to feols() and fepois() #547

Closed
mcsqr opened this issue Jul 12, 2024 · 4 comments · Fixed by #548
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@mcsqr

mcsqr commented Jul 12, 2024

Unless I somehow missed this: fixest has a lean parameter that strips the large, memory-hungry components from the result object. It's really useful for large computations, where using pyfixest currently results in out-of-memory (OOM) errors.

@s3alfisc
Member

s3alfisc commented Jul 12, 2024

Good that you mention this, it has been bothering me as well 😀 Note that you can already economize a bit by setting 'store_data=False' and 'copy_data=False', though this still keeps a range of memory-heavy attributes around.

I suppose an argument 'lean=True' should also drop the Y and X attributes and run garbage collection to free the memory.

Should be an easy enough addition =)

@s3alfisc s3alfisc changed the title "Lean" option not supported Add "lean" argument to feols() and feöo Jul 12, 2024
@s3alfisc s3alfisc changed the title Add "lean" argument to feols() and feöo Add "lean" argument to feols() and fepois() Jul 12, 2024
@s3alfisc s3alfisc added enhancement New feature or request good first issue Good for newcomers labels Jul 12, 2024
@s3alfisc
Member

s3alfisc commented Jul 12, 2024

Context

Quite a lot of large objects are stored in Feols / Feiv / Fepois objects. To avoid out-of-memory errors when working with big data sets, we want to add a function argument lean to all three classes mentioned above and the feols() and fepois() APIs.

Task

At the end of the run_all_models method of the FixestMulti class, if lean = True, drop the following attributes:

  • self._X
  • self._Y
  • self._Z
  • self._cluster_df
  • self._data

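In code, do something like:

import gc

# At the end of the method, once all models have been fitted:
if lean:
    # Drop the large design matrices and data frames
    del self._X
    del self._Y
    del self._Z
    del self._cluster_df
    del self._data
    # Ask the garbage collector to reclaim the released memory
    gc.collect()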
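
For illustration, here is a minimal self-contained sketch of the pattern described above. `ToyModel`, `fit`, `clear_large_attributes`, and `LARGE_ATTRS` are hypothetical stand-ins, not pyfixest classes or APIs. The `hasattr` guard is an added assumption: it keeps the cleanup from raising `AttributeError` when a model never set one of the attributes (here `_Z` and `_cluster_df` are deliberately left unset).

```python
import gc

# Hypothetical list of memory-heavy attributes, mirroring the list in the task.
LARGE_ATTRS = ("_X", "_Y", "_Z", "_cluster_df", "_data")


class ToyModel:
    """Hypothetical stand-in for a fitted model; not a pyfixest class."""

    def __init__(self, n_rows):
        # Large intermediate objects that are kept after fitting.
        self._X = [[1.0, float(i)] for i in range(n_rows)]
        self._Y = [2.0 * i for i in range(n_rows)]
        self._data = list(zip(self._X, self._Y))
        # Small result that must survive the cleanup.
        self.coef = [0.0, 2.0]
        # `_Z` and `_cluster_df` are deliberately never set on this toy model.

    def clear_large_attributes(self):
        """Delete large attributes if present, then run the garbage collector."""
        for name in LARGE_ATTRS:
            if hasattr(self, name):
                delattr(self, name)
        gc.collect()


def fit(n_rows, lean=False):
    """Hypothetical fitting entry point with a `lean` switch."""
    model = ToyModel(n_rows)
    if lean:
        model.clear_large_attributes()
    return model


lean_fit = fit(1_000, lean=True)
full_fit = fit(1_000, lean=False)
print(hasattr(lean_fit, "_X"), hasattr(full_fit, "_X"))
print(lean_fit.coef)
```

Deleting the attributes (rather than setting them to None) means later code that touches them fails loudly with an `AttributeError`, which may or may not be the behaviour you want for post-estimation methods that need the data.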

@mcsqr
Author

mcsqr commented Jul 12, 2024

👏

@s3alfisc
Member

Done 👍

You can now specify a lean function argument:

%load_ext autoreload
%autoreload 2

import pyfixest as pf 
data = pf.get_data()
fit = pf.feols("Y ~ X1", data = data, lean = True)
hasattr(fit, "_X")
# False
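
To check how much memory a lean result actually retains, one option is the standard library's `tracemalloc`. The sketch below is only an illustration on a toy object: `ToyFit` and `retained_bytes` are hypothetical names, not part of pyfixest, and the numbers will differ for real models.

```python
import gc
import tracemalloc


class ToyFit:
    """Hypothetical stand-in for a fitted model; not a pyfixest class."""

    def __init__(self, n_rows, lean=False):
        # Large objects built during "fitting".
        self._X = [float(i) for i in range(n_rows)]
        self._Y = [2.0 * i for i in range(n_rows)]
        # Small result that is always kept.
        self.coef = 2.0
        if lean:
            # Drop the large objects once the result is computed.
            del self._X
            del self._Y
            gc.collect()


def retained_bytes(lean, n_rows=100_000):
    """Return the traced memory still held right after building one ToyFit."""
    gc.collect()
    tracemalloc.start()
    fit = ToyFit(n_rows, lean=lean)  # keep a reference while measuring
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del fit
    return current


full_bytes = retained_bytes(lean=False)
lean_bytes = retained_bytes(lean=True)
print(f"full: {full_bytes} bytes, lean: {lean_bytes} bytes")
```

In this toy setup the large lists still exist while the object is being built, so the peak memory is unchanged; only the memory held after fitting goes down.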
