checkpoint can't be loaded on MacOS with M3 #6422

Closed
guxiaobo opened this issue Apr 18, 2024 · 12 comments
@guxiaobo

guxiaobo commented Apr 18, 2024

Description

I used Ray Tune with LightGBM and got a best-model checkpoint from a demo run, saved as a model.txt file.
Trying to load that model.txt checkpoint causes Python to crash.

Reproducible example

The steps and source are listed in ray-project/ray#44566.

Environment info

LightGBM version or commit hash:

I am using the lightgbm Python package with Ray 2.10.0, on macOS 14.4.1 with an M3 chip and Python 3.11.7.

@jameslamb
Collaborator

Thanks for using LightGBM.

What version of lightgbm do you have and how did you install it?
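
One way to answer both parts of that question without importing lightgbm (which matters when the import or model load itself is what crashes) is to read the package metadata. This is a minimal sketch using only the standard library; the helper name `describe_install` is hypothetical, and the `INSTALLER` metadata file is optional, so the installer may come back as "unknown".

```python
from importlib import metadata


def describe_install(package: str) -> dict:
    """Report version and installer for a package without importing it."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return {"package": package, "installed": False}

    # INSTALLER is an optional metadata file; it typically names the tool
    # (for example pip or conda) that placed the package in the environment.
    installer = dist.read_text("INSTALLER")
    return {
        "package": package,
        "installed": True,
        "version": dist.version,
        "installer": installer.strip() if installer else "unknown",
    }


if __name__ == "__main__":
    for name in ("lightgbm", "lightgbm-ray", "ray"):
        print(describe_install(name))
```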

@guxiaobo
Author

I installed the lightgbm-ray 0.1.9 Python package, which installed lightgbm 4.3.0.

@jameslamb
Collaborator

Can you please share the model file that you mentioned here: ray-project/ray#44566 (comment)?

@guxiaobo
Author

model.txt

@jameslamb
Collaborator

jameslamb commented Apr 19, 2024

Thanks. We're going to need more information to help you with this.

In ray-project/ray#44566 (comment), you claimed that code like this...

from lightgbm import Booster
booster = Booster(model_file="model.txt")

... "fails". I ran that today on my M2 Mac, with the model file you provided, using Python 3.11 and lightgbm 4.3.0, and did not see it fail.

conda create \
    -c conda-forge \
    --name lgb-test \
        python=3.11 lightgbm=4.3.0

source activate lgb-test

Can you please share the specific commands you used to install lightgbm? e.g. pip install, conda install, something else?

Can you try installing the conda package from conda-forge as I showed above and tell us if it fixes the issue?

@guxiaobo
Author

guxiaobo commented Apr 19, 2024

pip install -U lightgbm-ray

@jameslamb
Collaborator

jameslamb commented Apr 19, 2024

Ok, are you willing to try conda as in my example above?

If not and you want to stay with pip, can you please try replacing your installation of lightgbm with one that doesn't use OpenMP? We have some known issues with OpenMP support on macOS that are still a work in progress (e.g. #4229).

pip uninstall --yes lightgbm
pip install \
    --no-binary lightgbm \
    --no-cache \
    --config-settings=cmake.define.USE_OPENMP=OFF \
    'lightgbm>=4.3.0'

Please let me know what happens if you try either of those.
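
To check whether a given installation links against an OpenMP runtime at all, the native library can be inspected without importing lightgbm. The sketch below is an assumption-laden diagnostic, not part of LightGBM: the helper names are hypothetical, the `lib_lightgbm.*` file pattern is a guess at how the shared library is named inside the package, and `otool -L` is the macOS tool for listing linked libraries (on other systems the second helper simply returns an empty list).

```python
import importlib.util
import shutil
import subprocess
from pathlib import Path


def find_native_libs(package: str, pattern: str = "lib_lightgbm.*") -> list[Path]:
    """Locate shared libraries inside an installed package without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []
    libs: list[Path] = []
    for location in spec.submodule_search_locations:
        libs.extend(sorted(Path(location).rglob(pattern)))
    return libs


def linked_openmp_runtimes(lib: Path) -> list[str]:
    """Return the `otool -L` lines that mention an OpenMP runtime (macOS only)."""
    otool = shutil.which("otool")
    if otool is None:
        return []  # not macOS, or the developer tools are not installed
    result = subprocess.run(
        [otool, "-L", str(lib)], capture_output=True, text=True, check=False
    )
    # libomp, libgomp and libiomp5 all contain "omp" in their file names.
    return [line.strip() for line in result.stdout.splitlines() if "omp" in line.lower()]


if __name__ == "__main__":
    for lib in find_native_libs("lightgbm"):
        print(lib, linked_openmp_runtimes(lib))
```

An empty list for a build made with USE_OPENMP=OFF would be consistent with the reinstall above having taken effect, though it is only a hint and not proof.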

Please provide as much information as possible, and if you have not seen it before please read https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax to help with formatting messages to make the difference between code, text produced by code, and your own words clearer.

@guxiaobo
Author

(base) guxiaobo@guxiaobodebijibendiannao Downloads % python
Python 3.11.7 (main, Dec 15 2023, 12:09:56) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from lightgbm import Booster
>>> booster = Booster(model_file="model.txt")
zsh: segmentation fault python
(base) guxiaobo@guxiaobodebijibendiannao Downloads %

Python 3.11.8 works fine, so this seems to be a Python-version-related problem.
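
A segmentation fault like the one above kills the interpreter, so it can be easier to reproduce it in a child process and record which signal ended it. The following is a minimal sketch for POSIX systems such as macOS; the helper name `run_isolated` is hypothetical. It uses the standard `-X faulthandler` interpreter option, which asks Python to dump a traceback to stderr on fatal signals.

```python
import signal
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 60.0):
    """Run a code snippet in a fresh interpreter and report how it exited.

    Returns (returncode, signal_name, stderr). On POSIX, a negative return
    code means the child was terminated by that signal number.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    signal_name = None
    if proc.returncode < 0:
        try:
            signal_name = signal.Signals(-proc.returncode).name
        except ValueError:
            signal_name = f"signal {-proc.returncode}"
    return proc.returncode, signal_name, proc.stderr


if __name__ == "__main__":
    # Same two lines as in the report; model.txt is assumed to be in the
    # current working directory.
    code = 'from lightgbm import Booster\nBooster(model_file="model.txt")'
    returncode, signal_name, stderr = run_isolated(code)
    print("return code:", returncode, "signal:", signal_name)
    print(stderr)
```

Running this under each Python version and each lightgbm installation gives output that can be pasted into the issue as-is.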

@jameslamb
Collaborator

I asked you to try 2 different installation methods for lightgbm... is this message you just posted the result of trying one of those? If so... which one? And did you try the other one?

We're happy to try to help, but please answer our direct questions.

@guxiaobo
Author

Hi James, I am sorry, I did not notice your post with the second method before my last post.
I have now tried both methods with both Python 3.11.7 and Python 3.11.8, and all of these cases work well.
The reinstalled lightgbm in my Ray env (the one where I ran "pip install -U lightgbm-ray") works well too.

@guxiaobo
Author

(base) guxiaobo@guxiaobodebijibendiannao Downloads % python
Python 3.11.7 (main, Dec 15 2023, 12:09:56) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lightgbm import Booster
>>> booster = Booster(model_file="model.txt")
zsh: segmentation fault python
(base) guxiaobo@guxiaobodebijibendiannao Downloads %

Python 3.11.8 works fine, so this seems to be a Python-version-related problem.
This output is from the Ray env in which lightgbm was installed by "pip install -U lightgbm-ray".

@jameslamb
Collaborator

No problem!

> all works well

Ok great! Then yes, I strongly suspect this is related to OpenMP compatibility issues. Until you see that #4229 is closed, I recommend using the lightgbm package from conda-forge on the M3 Mac.

Very sorry for the difficulty. macOS packaging is an area I'm focusing on for the next release of LightGBM, so hopefully this will get better soon. You can subscribe to #4229 to be notified of changes.

Since it seems like we've identified the root cause and found a fix, I'm going to close this. Please comment if you have any other concerns.

jameslamb added the bug label and removed the question label on Apr 19, 2024