I hope this is the right place to ask this question. Let me know if I need to move to another repo.
Currently I'm using NeuronModelForCausalLM which uses LlamaForSampling under the hood.
I have a use case where I need to be able to do the following:
1. Generate token embeddings
2. Modify the embeddings
3. Run inference from the modified embeddings
I am able to do steps 1 & 2 currently using the following:
```python
from optimum.neuron import NeuronModelForCausalLM

llama_model = NeuronModelForCausalLM.from_pretrained('aws-neuron/Llama-2-7b-chat-hf-seqlen-2048-bs-1')
embedded_tokens = llama_model.model.chkpt_model.model.embed_tokens(token_ids)

### Code to modify embedded_tokens
```
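For concreteness, the kind of modification I have in mind looks roughly like this. This is a plain-PyTorch sketch, not the actual Neuron model: the `Embedding` layer stands in for `embed_tokens`, and the steering vector and scale are purely illustrative.

```python
import torch

# Illustrative stand-in for the model's embed_tokens layer: maps
# (batch, seq_len) token ids -> (batch, seq_len, hidden) embeddings.
embed = torch.nn.Embedding(num_embeddings=32000, embedding_dim=4096)

token_ids = torch.tensor([[1, 306, 4966]])          # (1, 3) batch of token ids
embedded_tokens = embed(token_ids)                   # (1, 3, 4096)

# Example modification: nudge every position along a fixed direction.
steering_vector = torch.randn(4096)
modified = embedded_tokens + 0.1 * steering_vector   # broadcasts over (1, 3, 4096)

assert modified.shape == embedded_tokens.shape
```

The actual edits I apply are more involved, but they all produce a tensor with the same `(batch, seq_len, hidden)` shape, which I then want to feed back into generation.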
However, as far as I can tell, generation from these modified embeddings is not possible with `llama_model.generate()`. When I pass the `inputs_embeds` keyword argument and set `input_ids=None`, I get the following:

```
ValueError: The following `model_kwargs` are not used by the model: ['inputs_embeds']
```
If this is not currently possible with `NeuronModelForCausalLM.generate()`, is there a way to work around it manually? If so, could you provide an example?
Thanks very much for your help!