
[Bug]: Model architectures ['LlavaForCausalLM'] are not supported for now in vllm 0.4.0.post1 #4008

Closed
stikkireddy opened this issue Apr 11, 2024 · 13 comments
Labels: bug (Something isn't working)


@stikkireddy

Your current environment

The output of `python collect_env.py`

🐛 Describe the bug

I am using:

from vllm import LLM

model = LLM(
    model="llava-hf/llava-1.5-13b-hf",
    image_input_type="pixel_values",
    download_dir="/tmp/models",
    image_token_id=32000,
    image_input_shape="1,3,336,336",
    image_feature_size=576,
)

which throws error: ValueError: Model architectures ['LlavaForCausalLM'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'FalconForCausalLM', 'GemmaForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LlavaForConditionalGeneration', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'OLMoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PhiForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'XverseForCausalLM']

After checking HuggingFace, it looks like llava-1.5-7b-hf uses LlavaForConditionalGeneration while llava-1.5-13b-hf uses LlavaForCausalLM.

Is there an easy workaround or fix for this?
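
For reference, one quick way to see which architecture name each checkpoint declares in its config.json (an illustrative snippet added for clarity, not part of the original report):

from transformers import AutoConfig

for name in ("llava-hf/llava-1.5-7b-hf", "llava-hf/llava-1.5-13b-hf"):
    cfg = AutoConfig.from_pretrained(name)
    # At the time of this issue, the 13b config reported LlavaForCausalLM.
    print(name, cfg.architectures)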

stikkireddy added the bug label on Apr 11, 2024
@stikkireddy (Author)

Maybe this is a feature request instead; I don't know enough about the intent behind supporting LLaVA.

@DarkLight1337 (Member)

I don't know how I missed that when writing PR #3978. Interesting... was the 7b model modified specifically to facilitate the proof-of-concept for vLLM?

@stikkireddy (Author)

I'm not sure. I reviewed #3042 and it seems LLaVA 7b and 13b use different classes; only LlavaForConditionalGeneration was supported, not LlavaForCausalLM.

@stikkireddy (Author)

@xwjiang2010 do you happen to know how much effort it would take to support LLaVA 13b? Pinging you since you worked on the initial PR for vision models.

@DarkLight1337 (Member) commented Apr 11, 2024

I think it's just a typo in their HuggingFace config.json. I can successfully load the 13b model in vLLM by pretending that it has LlavaForConditionalGeneration architecture. I applied the following patch on top of DarkLight1337:openai-vision-api:

diff --git a/tests/conftest.py b/tests/conftest.py
index a7e8963..61ca4fe 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -131,6 +131,7 @@ _STR_DTYPE_TO_TORCH_DTYPE = {
 
 _VISION_LANGUAGE_MODELS = {
     "llava-hf/llava-1.5-7b-hf": LlavaForConditionalGeneration,
+    "llava-hf/llava-1.5-13b-hf": LlavaForConditionalGeneration,
 }
 
 
diff --git a/vllm/model_executor/models/__init__.py b/vllm/model_executor/models/__init__.py
index 17fc970..54a077d 100755
--- a/vllm/model_executor/models/__init__.py
+++ b/vllm/model_executor/models/__init__.py
@@ -33,6 +33,8 @@ _MODELS = {
     "LlamaForCausalLM": ("llama", "LlamaForCausalLM"),
     "LlavaForConditionalGeneration":
     ("llava", "LlavaForConditionalGeneration"),
+    "LlavaForCausalLM":
+    ("llava", "LlavaForConditionalGeneration"),
     # For decapoda-research/llama-*
     "LLaMAForCausalLM": ("llama", "LlamaForCausalLM"),
     "MistralForCausalLM": ("llama", "LlamaForCausalLM"),

Update: I have tested with some other models:

  • There is no need to apply this patch to llava-hf/bakLlava-v1-hf since it has LlavaForConditionalGeneration architecture as advertised.
  • Even with this patch, I cannot load models with LlavaLlamaForCausalLM architecture, notably liuhaotian/llava-v1.5-7b.

@stikkireddy (Author)

Hmm, okay. Then I can probably clone their model and just modify the config.json to use the LlavaForConditionalGeneration architecture.
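
In case it helps, a minimal sketch of that workaround (assuming it is enough to edit a local copy of config.json; the local_dir path is just an example):

import json
from huggingface_hub import snapshot_download

# Download the checkpoint to a directory we can edit.
local_dir = snapshot_download(
    "llava-hf/llava-1.5-13b-hf",
    local_dir="/tmp/models/llava-1.5-13b-hf",
)

# Rewrite the architecture field so vLLM routes the model to the
# supported LlavaForConditionalGeneration implementation.
cfg_path = f"{local_dir}/config.json"
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["architectures"] = ["LlavaForConditionalGeneration"]
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)

# Then pass model=local_dir to LLM(...) instead of the hub id.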

@DarkLight1337 (Member)

I have updated my PR accordingly.

@Jianzhao-Huang commented Apr 16, 2024

(In reply to DarkLight1337's patch comment above.)

Thank you for your effort in supporting LLaVA in vLLM! This is great work.

After checking the code in the original LLaVA repo (https://github.com/haotian-liu/LLaVA), I found that llava-1.5-13b does use the LlavaForCausalLM class, so I don't think it's a typo in HuggingFace's config.json.

I don't know whether there would be runtime bugs or performance losses after directly modifying llava-1.5-13b's config.json to use the LlavaForConditionalGeneration architecture. Could you please check this?
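
As an aside (not part of the original comment), one way to spot-check the renamed checkpoint before relying on the workaround is to run it once through transformers' own LlavaForConditionalGeneration; a weight/architecture mismatch typically shows up as missing or unexpected key warnings, or as degenerate output. This is only a sketch, and the image URL and prompt format are illustrative:

import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))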

@DarkLight1337 (Member) commented Apr 16, 2024

@Jianzhao-Huang It doesn't look like HuggingFace's transformers library actually defines a LlavaForCausalLM class, so I think both models should use LlavaForConditionalGeneration as the architecture name.
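
A quick check against the installed transformers package illustrates the point (assuming a recent transformers release; the snippet is only illustrative):

import transformers

print(hasattr(transformers, "LlavaForConditionalGeneration"))  # True on recent releases
print(hasattr(transformers, "LlavaForCausalLM"))               # False: no such class is exported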

In any case, I have just pushed a commit to #3978 which adds the 13b model to the LLaVA test case that checks its consistency against the native HuggingFace model.

@DarkLight1337 (Member) commented Apr 17, 2024

Hmm, it seems that GPU memory fails to be freed between testing each model. Does anyone know how to fix this?
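
For context, a common pattern for releasing GPU memory between test cases looks roughly like this (a sketch only: llm stands for the engine created by the previous test, and this is not necessarily the fix that was eventually adopted):

import gc
import torch

del llm                   # drop the last reference to the vLLM engine
gc.collect()              # let Python reclaim the wrapper objects
torch.cuda.empty_cache()  # return cached blocks to the CUDA allocator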

@DarkLight1337 (Member) commented Apr 18, 2024

I have investigated further and it seems that the CI/CD infrastructure cannot even load the 13B model into memory (I removed all other LLaVA models from the test and it still OOMed). Not sure what I can do about that...

@Iven2132

Any fix?

@DarkLight1337 (Member)

The incorrect architecture in config.json has been fixed on HuggingFace, as per this discussion.
