[Bug]: ModuleNotFoundError: No module named 'bitsandbytes' #5503

Closed · Fixed by #8792

emillykkejensen opened this issue Jun 13, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@emillykkejensen

Your current environment

Using Docker!

🐛 Describe the bug

Running the v0.5.0 Docker image with bitsandbytes quantization gives me the following error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 83, in __init__
[rank0]:     import bitsandbytes
[rank0]: ModuleNotFoundError: No module named 'bitsandbytes'

[rank0]: The above exception was the direct cause of the following exception:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[rank0]:     return _run_code(code, main_globals, None,
[rank0]:   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
[rank0]:     exec(code, run_globals)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 196, in <module>
[rank0]:     engine = AsyncLLMEngine.from_engine_args(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 395, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 349, in __init__
[rank0]:     self.engine = self._init_engine(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 470, in _init_engine
[rank0]:     return engine_class(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 223, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 41, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 24, in _init_executor
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 121, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 147, in load_model
[rank0]:     self.model = get_model(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 775, in load_model
[rank0]:     model = _initialize_model(model_config, self.load_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 97, in _initialize_model
[rank0]:     return model_class(config=model_config.hf_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 340, in __init__
[rank0]:     self.model = LlamaModel(config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 262, in __init__
[rank0]:     self.layers = nn.ModuleList([
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 263, in <listcomp>
[rank0]:     LlamaDecoderLayer(config=config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 188, in __init__
[rank0]:     self.self_attn = LlamaAttention(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 122, in __init__
[rank0]:     self.qkv_proj = QKVParallelLinear(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 540, in __init__
[rank0]:     super().__init__(input_size=input_size,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 233, in __init__
[rank0]:     super().__init__(input_size, output_size, skip_bias_add, params_dtype,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 147, in __init__
[rank0]:     self.quant_method = quant_config.get_quant_method(self)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 67, in get_quant_method
[rank0]:     return BitsAndBytesLinearMethod(self)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 88, in __init__
[rank0]:     raise ImportError("Please install bitsandbytes>=0.42.0 via "
[rank0]: ImportError: Please install bitsandbytes>=0.42.0 via `pip install bitsandbytes>=0.42.0` to use bitsandbytes quantizer.
emillykkejensen added the bug label on Jun 13, 2024
@jeejeelee
Contributor

This is a feature: if you want to use bitsandbytes in vLLM, you must install bitsandbytes yourself first.

@emillykkejensen
Author

Okay, thanks for the clarification. Is there a preferred way of adding feature dependencies to the vLLM image at runtime?

@jeejeelee
Contributor

If I understand correctly, perhaps you can try:

docker exec -ti container_id /bin/bash 

After entering the container, run:

pip install "bitsandbytes>=0.42.0"

(quote the requirement so the shell doesn't treat >= as an output redirect)

BTW, if you build the image yourself from a Dockerfile, you can add this dependency in the Dockerfile instead.
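As a minimal sketch (assuming container_id is whatever docker ps reports for your vLLM container), the install can also be done in one step without an interactive shell; note that anything installed this way is lost when the container is recreated:

docker exec container_id pip install "bitsandbytes>=0.42.0"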

@simon-mo
Collaborator

Please send a PR to include it in the Dockerfile so it can work out of the box.

@philjak

philjak commented Sep 4, 2024

This is a feature: if you want to use bitsandbytes in vLLM, you must install bitsandbytes yourself first.

If I want to spawn a container with a given bitsandbytes model, the container will crash because of this missing dependency. +1 for having this right in the image.

@tvvignesh

I am facing the same issue as well:

[screenshot of the same error]

@fullstackwebdev

annoying, lol

@tvvignesh

I had to write a Dockerfile like this, and it worked:

FROM vllm/vllm-openai:latest

RUN pip install "bitsandbytes>=0.42.0"

ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
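To build and run it, a minimal sketch (the vllm-bnb tag and the model name are placeholders; this assumes vLLM's --quantization bitsandbytes and --load-format bitsandbytes options, which the docs pair together for bitsandbytes models):

docker build -t vllm-bnb .
docker run --gpus all -p 8000:8000 vllm-bnb \
    --model your-org/your-model \
    --quantization bitsandbytes \
    --load-format bitsandbytes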
