Demo is broken #3

Open · Lyken17 opened this issue Apr 4, 2023 · 3 comments

Comments

Lyken17 commented Apr 4, 2023

It seems the Hugging Face demo is broken now.

[screenshot: the Hugging Face Space demo showing an error]

After cloning the repo locally and running some simple tests, the issue seems to be related to the attention mask / tokenizer (I guess?):

File ~/anaconda3/envs/pth/lib/python3.9/site-packages/transformers/generation/utils.py:737, in GenerationMixin._update_model_kwargs_for_generation(self, outputs, model_kwargs, is_encoder_decoder, standardize_cache_format)
    735     if "attention_mask" in model_kwargs:
    736         attention_mask = model_kwargs["attention_mask"]
--> 737         model_kwargs["attention_mask"] = torch.cat(
    738             [attention_mask, attention_mask.new_ones((attention_mask.shape[0], 1))], dim=-1
    739         )
    740 else:
    741     # update decoder attention mask
    742     if "decoder_attention_mask" in model_kwargs:

RuntimeError: Tensors must have same number of dimensions: got 4 and 2
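
For illustration, the mismatch can be reproduced in isolation: the model's remote code leaves a 4-D attention mask in model_kwargs, while the transformers generation loop assumes a 2-D (batch, seq) mask and appends one column of ones per generated token. A minimal sketch (the shapes are assumptions for demonstration, not the model's actual sizes):

import torch

# Hypothetical 4-D mask of shape (batch, 1, seq, seq), as the older
# ChatGLM remote code produced:
attention_mask = torch.ones(1, 1, 8, 8)

# transformers' _update_model_kwargs_for_generation appends a 2-D
# column of ones, shape (batch, 1):
new_column = attention_mask.new_ones((attention_mask.shape[0], 1))

# torch.cat requires matching dimensionality, so this raises:
# RuntimeError: Tensors must have same number of dimensions: got 4 and 2
torch.cat([attention_mask, new_column], dim=-1)
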
Lyken17 (Author) commented Apr 4, 2023

Problem located. THUDM changed the position encoding and attention mask handling in https://huggingface.co/THUDM/chatglm-6b/commit/373fd6b9d484841b490856a5570d6c450f20c22c

Thus, switching to the latest implementation (loading via AutoModel with trust_remote_code=True instead of the vendored modeling file) solves the issue:

# Old approach: a vendored copy of modeling_chatglm, now out of sync
# with the updated remote code on the Hub.
# from modeling_chatglm import ChatGLMForConditionalGeneration
# from transformers import AutoTokenizer, GenerationConfig
# model = ChatGLMForConditionalGeneration.from_pretrained("THUDM/chatglm-6b").float()
# tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# New approach: let AutoModel pull the matching remote code from the Hub.
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
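
As a quick smoke test, assuming the chat helper that ChatGLM-6B's remote code ships (model.chat is specific to that remote code, not a standard transformers API):

# Verify generation works end to end after the switch.
model = model.eval()
response, history = model.chat(tokenizer, "Hello", history=[])
print(response)
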

ljsabc (Owner) commented Apr 5, 2023

Upstream keeps shipping breaking changes, and I wrongly set up my Hugging Face Space without pinning the upstream revision.
I'm wondering whether it would be better to load everything through AutoModel so that every one of us can eat our own dog food.

Will keep this issue updated.
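
For reference, a minimal sketch of pinning: from_pretrained accepts a revision argument, so the Space could be locked to a known-good commit of the Hub repo. The hash below is a placeholder, not a verified-good revision:

from transformers import AutoTokenizer, AutoModel

# Pin both tokenizer and model to one commit of the Hub repo so
# upstream pushes can't break the Space. Replace GOOD_COMMIT with
# the last revision known to work (placeholder value here).
GOOD_COMMIT = "0000000000000000000000000000000000000000"

tokenizer = AutoTokenizer.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, revision=GOOD_COMMIT
)
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, revision=GOOD_COMMIT
).float()
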

ljsabc (Owner) commented Apr 5, 2023

I have updated the repo, but I'm not fully sure everything is working now. Since the model weights haven't changed at all, I'm about 90% confident. I'll keep this issue open for a while as I test with the new environment.
