Memory leaks / is better memory management possible? #4239
cc @shiyu1994
The memory leak has been present since version 3.0.0. So far I can only identify that the leak comes from …
Memory leak with `LGBMRegressor(device_type='cuda')`:

```python
import gc
import numpy as np
import pandas as pd
from sklearn import datasets
from lightgbm import LGBMRegressor

def mem(msg=""):
    """ Memory usage in MB """
    with open("/proc/self/status") as f:
        memusage = f.read().split("VmRSS:")[1].split("\n")[0][:-3]
    print(msg, "- memory:", np.round(float(memusage.strip()) / 1024.0), "MB")

mem("Start")
X, y = datasets.make_regression(
    n_samples=1000,
    n_features=1000,
    n_informative=5,
    random_state=0,
)
mem("Created data frame")
for i in range(50000):
    gbm = LGBMRegressor(device_type='cuda')
    gbm.fit(X, y)
    del gbm
    gc.collect()
    mem(f"Iteration #{i}")
gc.collect()
mem("End of script")
```

outputs:
@shiyu1994 any updates on this issue?
Are there any solutions/workarounds to this problem?
Just to add a small data point: I also experience this issue when fitting LGBMClassifier with device type cuda. Like @ravehun, I tried deleting the model, with no improvement. I do not see the memory leak when using the CPU, but I would greatly prefer to use the GPU if this issue can be resolved.
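One workaround worth noting (a hedged sketch, not a confirmed fix from the LightGBM team): when a leak is tied to native or GPU-side allocations, running each fit in a short-lived worker process lets the OS reclaim everything when the process exits. A minimal sketch using only the standard library, assuming a CUDA-enabled LightGBM build:

```python
import multiprocessing as mp

from sklearn import datasets
from lightgbm import LGBMRegressor


def fit_and_predict(X, y, X_pred):
    # Everything allocated here (including native/GPU memory) is released
    # to the OS when the worker process exits.
    gbm = LGBMRegressor(device_type='cuda')  # assumes a CUDA build of LightGBM
    gbm.fit(X, y)
    return gbm.predict(X_pred)


if __name__ == "__main__":
    X, y = datasets.make_regression(n_samples=1000, n_features=100, random_state=0)
    # "spawn" avoids inheriting CUDA state from the parent process;
    # maxtasksperchild=1 forces a fresh worker per fit.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=1, maxtasksperchild=1) as pool:
        preds = pool.apply(fit_and_predict, (X, y, X[:10]))
    print(preds.shape)
```

The cost is pickling the data to each worker, so this mainly makes sense when the leak, rather than throughput, is the bottleneck.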
I strongly suspect that this has been fixed by changes to LightGBM, its dependencies, or Python in the 3 years since it was first reported. I ran the following today on an M2 Mac (so CPU-only, no CUDA):

```shell
docker run \
    --rm \
    -it python:3.11 \
    bash
```

```shell
pip install 'lightgbm==4.3.0' 'pandas>=2.2.2' 'scikit-learn>=1.4.2'
```

I then ran a slightly-modified version of the original script provided for this issue (as the script below shows: `verbose=-1` instead of `device_type='cuda'`, a larger dataset, fewer iterations, and deleting `X` and `y` at the end).

check-lgb.py:

```shell
cat << EOF > check-lgb.py
import gc
import numpy as np
import pandas as pd
from sklearn import datasets
from lightgbm import LGBMRegressor

def mem(msg=""):
    """ Memory usage in MB """
    with open("/proc/self/status") as f:
        memusage = f.read().split("VmRSS:")[1].split("\n")[0][:-3]
    print(msg, "- memory:", np.round(float(memusage.strip()) / 1024.0), "MB")

mem("Start")
X, y = datasets.make_regression(
    n_samples=100000,
    n_features=1000,
    n_informative=5,
    random_state=0,
)
mem("Created data frame")
for i in range(20):
    gbm = LGBMRegressor(verbose=-1)
    gbm.fit(X, y)
    del gbm
    gc.collect()
    mem(f"Iteration #{i}")
del X
del y
gc.collect()
mem("End of script")
EOF
```

```shell
python ./check-lgb.py
```

I don't see evidence of a memory leak.
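For anyone re-checking this on their own setup, one way to turn the script above into a pass/fail check (a sketch, reusing the Linux-only `/proc` parsing and an arbitrary 100 MB growth threshold, which is not a value from this thread):

```python
import gc

from sklearn import datasets
from lightgbm import LGBMRegressor


def rss_mb():
    """Current resident set size in MB, parsed from /proc (Linux-only)."""
    with open("/proc/self/status") as f:
        vmrss_kb = f.read().split("VmRSS:")[1].split("\n")[0][:-3]
    return float(vmrss_kb.strip()) / 1024.0


X, y = datasets.make_regression(n_samples=10000, n_features=100, random_state=0)

# Warm up once so one-time allocations (thread pools, caches) are excluded.
LGBMRegressor(verbose=-1).fit(X, y)
baseline = rss_mb()

for i in range(20):
    gbm = LGBMRegressor(verbose=-1)
    gbm.fit(X, y)
    del gbm
    gc.collect()

# 100 MB is an arbitrary threshold; steady growth past it suggests a leak.
growth = rss_mb() - baseline
assert growth < 100.0, f"RSS grew by {growth:.0f} MB across 20 fits"
```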
There are some other things that also make me think this may have been fixed in recent versions of LightGBM.
If anyone reports that this is "still" a problem, please provide a reproducible example using a recent version. I'm adding the label …
Thank you @jameslamb and the whole LightGBM team!
Description
I'm working on an AutoML package. My users observed increased memory usage (mljar/mljar-supervised#381), so I started to dig.
I found that LightGBM consumes a lot of RAM and does not release it, even after the model is deleted.
Reproducible example
Output:
Environment info
LightGBM version: 3.2.1
Python: 3.8.5
OS: Ubuntu 20.04
Additional Comments
I love using LightGBM because of its speed. It is much faster than other GBMs, especially on multiclass classification tasks with many classes (> 50).