[Bug]: ModuleNotFoundError: No module named 'bitsandbytes' #5503

Closed · Fixed by #8792

emillykkejensen opened this issue Jun 13, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@emillykkejensen

Your current environment

Using Docker!

🐛 Describe the bug

Running the v0.5.0 Docker image with bitsandbytes quantization gives me the following error:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 83, in __init__
[rank0]:     import bitsandbytes
[rank0]: ModuleNotFoundError: No module named 'bitsandbytes'

[rank0]: The above exception was the direct cause of the following exception:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
[rank0]:     return _run_code(code, main_globals, None,
[rank0]:   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
[rank0]:     exec(code, run_globals)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/openai/api_server.py", line 196, in <module>
[rank0]:     engine = AsyncLLMEngine.from_engine_args(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 395, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 349, in __init__
[rank0]:     self.engine = self._init_engine(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 470, in _init_engine
[rank0]:     return engine_class(*args, **kwargs)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 223, in __init__
[rank0]:     self.model_executor = executor_class(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 41, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 24, in _init_executor
[rank0]:     self.driver_worker.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 121, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 147, in load_model
[rank0]:     self.model = get_model(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 21, in get_model
[rank0]:     return loader.load_model(model_config=model_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 775, in load_model
[rank0]:     model = _initialize_model(model_config, self.load_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 97, in _initialize_model
[rank0]:     return model_class(config=model_config.hf_config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 340, in __init__
[rank0]:     self.model = LlamaModel(config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 262, in __init__
[rank0]:     self.layers = nn.ModuleList([
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 263, in <listcomp>
[rank0]:     LlamaDecoderLayer(config=config,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 188, in __init__
[rank0]:     self.self_attn = LlamaAttention(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/llama.py", line 122, in __init__
[rank0]:     self.qkv_proj = QKVParallelLinear(
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 540, in __init__
[rank0]:     super().__init__(input_size=input_size,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 233, in __init__
[rank0]:     super().__init__(input_size, output_size, skip_bias_add, params_dtype,
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/linear.py", line 147, in __init__
[rank0]:     self.quant_method = quant_config.get_quant_method(self)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 67, in get_quant_method
[rank0]:     return BitsAndBytesLinearMethod(self)
[rank0]:   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/bitsandbytes.py", line 88, in __init__
[rank0]:     raise ImportError("Please install bitsandbytes>=0.42.0 via "
[rank0]: ImportError: Please install bitsandbytes>=0.42.0 via `pip install bitsandbytes>=0.42.0` to use bitsandbytes quantizer.
emillykkejensen added the bug label on Jun 13, 2024
@jeejeelee
Contributor

This is a feature: if you want to use bitsandbytes in vLLM, you must install bitsandbytes yourself first.

@emillykkejensen
Author

Okay, thanks for the clarification. Is there a preferred way of adding feature dependencies to the vLLM image at runtime?

@jeejeelee
Contributor

If I understand correctly, perhaps you can try:

docker exec -ti container_id /bin/bash 

After entering the container, run:

pip install "bitsandbytes>=0.42.0"

(quote the requirement so the shell doesn't treat >= as an output redirect)

BTW, if you build the image yourself from a Dockerfile, you can add this dependency in the Dockerfile instead.
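As a minimal sketch (assuming container_id is whatever docker ps reports for your vLLM container), the install can also be done in one step without an interactive shell; note that anything installed this way is lost when the container is recreated:

docker exec container_id pip install "bitsandbytes>=0.42.0"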

@simon-mo
Collaborator

Please send a PR to include it in the Dockerfile so it can work out of the box.

@philjak

philjak commented Sep 4, 2024

This is a feature: if you want to use bitsandbytes in vLLM, you must install bitsandbytes yourself first.

If I want to spawn a container with a given bitsandbytes model, the container will crash because of this missing dependency. +1 for having this right in the image.

@tvvignesh

I am facing the same issue as well:

[screenshot of the same error]

@fullstackwebdev

annoying, lol

@tvvignesh

I had to write a Dockerfile like this, and it worked:

FROM vllm/vllm-openai:latest

RUN pip install "bitsandbytes>=0.42.0"

ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
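To build and run it, a minimal sketch (the vllm-bnb tag and the model name are placeholders; this assumes vLLM's --quantization bitsandbytes and --load-format bitsandbytes options, which the docs pair together for bitsandbytes models):

docker build -t vllm-bnb .
docker run --gpus all -p 8000:8000 vllm-bnb \
    --model your-org/your-model \
    --quantization bitsandbytes \
    --load-format bitsandbytes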
