
Support internlm2 model #2527

Closed
wants to merge 4 commits into from

Conversation

@esmeetu (Collaborator) commented Jan 21, 2024

Model info: https://huggingface.co/internlm/internlm2-chat-7b

Currently this doesn't work with TP2, which outputs garbled text, and I haven't tested TP1 yet.

Test code:

from vllm import LLM, SamplingParams

prompts = [
    "<s><|im_start|>user\nhello<|im_end|>\n<|im_start|>assistant\n"
]
sampling_params = SamplingParams(temperature=0.0, max_tokens=64)

llm = LLM(model="internlm/internlm2-chat-7b", trust_remote_code=True, tensor_parallel_size=2, dtype="half", enforce_eager=True)
outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Result:

Prompt: '<s><|im_start|>user\nhello<|im_end|>\n<|im_start|>assistant\n', Generated text: ' that that that that that that that which which which which which which which which which which which  and and and and and������������� <|im_end|> 是� <|im_end|> 是���是� <|im_end|> 是����� <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|> <|im_end|>'

Official Result:

Hello! How can I assist you today?
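
For reference, the official result above can be reproduced with the upstream Transformers implementation, roughly following the snippet on the model card (the chat helper comes from the model's remote code; this sketch is untested here):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm2-chat-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "internlm/internlm2-chat-7b", torch_dtype=torch.float16,
    trust_remote_code=True).cuda().eval()

# chat() is provided by the model's remote code and applies the
# <|im_start|>/<|im_end|> template internally.
response, history = model.chat(tokenizer, "hello", history=[])
print(response)  # expected: "Hello! How can I assist you today?"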

I don't know why the output of self.wqkv(hidden_states) is not right. Could someone help, based on my current PR?
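
For context, here is a minimal sketch of how the packed wqkv weight appears to be laid out, judging from the rearrange calls in the upstream internlm2 modeling code: query, key and value rows are interleaved per key/value group, rather than concatenated as q, then k, then v the way vLLM's qkv_proj expects. All sizes below are illustrative (taken from internlm2-chat-7b's config); this is a hypothesis about where the mismatch comes from, not the fix.

import torch

# Illustrative sizes (assumed from internlm2-chat-7b's config):
# 32 query heads, 8 key/value heads, head_dim 128, hidden_size 4096.
num_heads, num_kv_heads, head_dim, hidden_size = 32, 8, 128, 4096
num_kv_groups = num_heads // num_kv_heads  # 4 query heads share each kv head

# The checkpoint packs rows as [q_0 .. q_{groups-1}, k, v] for every kv head.
wqkv = torch.randn(num_kv_heads * (num_kv_groups + 2) * head_dim, hidden_size)

# Un-interleave into the [q; k; v] concatenation that qkv_proj expects.
w = wqkv.view(num_kv_heads, num_kv_groups + 2, head_dim, hidden_size)
wq = w[:, :num_kv_groups].reshape(num_heads * head_dim, hidden_size)
wk = w[:, -2].reshape(num_kv_heads * head_dim, hidden_size)
wv = w[:, -1].reshape(num_kv_heads * head_dim, hidden_size)
qkv_proj_weight = torch.cat([wq, wk, wv], dim=0)

If this layout is right, one plausible explanation for the TP2 garbling is that the packed weight gets row-sharded across ranks before it is un-interleaved, so each rank ends up with a mix of heads that no longer line up.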

@esmeetu esmeetu changed the title [WIP] Support internlm2 Support internlm2 model Jan 28, 2024
@esmeetu esmeetu marked this pull request as ready for review January 28, 2024 07:46
@esmeetu (Collaborator, Author) commented Jan 28, 2024

Hi, @zhuohan123. This PR now works after some fixes to weight loading, but I have two questions:

  1. Could from einops import rearrange be replaced by torch functions, so that we can drop this dependency? (See the sketch after this list.)
  2. Do you have any ideas about the difference between the wqkv weight format in the checkpoint and vLLM's default qkv_proj weight layout, based on my weight-loading implementation?

Anyway, I will explore both of these further.
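
On question 1, here is a small sketch suggesting the two rearrange patterns used in the upstream modeling code can be expressed with plain view/reshape, since neither of them permutes axes. Shapes are illustrative and this has not been checked against the real checkpoint:

import torch
from einops import rearrange

b, q_len = 2, 5
num_kv_heads, num_kv_groups, head_dim = 8, 4, 128

x = torch.randn(b, q_len, num_kv_heads * (num_kv_groups + 2) * head_dim)

# einops pattern used upstream:
y_einops = rearrange(x, "b q (h gs d) -> b q h gs d",
                     gs=num_kv_groups + 2, d=head_dim)

# pure-torch equivalent: a single view, since no axes are permuted.
y_torch = x.view(b, q_len, num_kv_heads, num_kv_groups + 2, head_dim)
assert torch.equal(y_einops, y_torch)

# The follow-up pattern "b q h gs d -> b q (h gs) d" on the query slice
# is likewise just a reshape.
q = y_torch[..., :num_kv_groups, :]
q_flat = q.reshape(b, q_len, num_kv_heads * num_kv_groups, head_dim)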

@Leymore Leymore mentioned this pull request Jan 30, 2024
@esmeetu (Collaborator, Author) commented Jan 30, 2024

Closing this since there is better support in #2666.

@esmeetu esmeetu closed this Jan 30, 2024
@esmeetu esmeetu deleted the internlm2 branch February 14, 2024 09:42