Is soft prompting enabled or is there a plan to do so? #4885

Closed
robhaslinger opened this issue Jan 12, 2024 · 4 comments
Labels: enhancement (New feature or request), stale

Comments

@robhaslinger

Does llama.cpp have (or is it planned to have) the ability to use soft prompts? In principle I think this should be reasonably straightforward, since it's just a matter of prepending the soft-prompt embeddings to the front of the embedded hard prompt. I don't know whether the GGUF format quantizes the embedding layer; if so, I suppose that might put a wrinkle in things.

Apologies if this is already enabled or has been discussed previously; I searched for a while and couldn't find anything. Thanks for your work on this repo, it's amazing.

Cheers, Rob

robhaslinger added the enhancement (New feature or request) label on Jan 12, 2024
@ggerganov (Owner) commented Jan 12, 2024

I'm not familiar with "soft prompts", but based on your description, you can first submit a llama_batch with the embeddings (use the float * embd member) and then proceed as usual.
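
A minimal sketch of that suggestion, assuming the soft-prompt vectors are already available as a plain float array (soft_prompt, n_virtual, and decode_soft_prompt are illustrative names, not part of the library, and exact llama.cpp signatures may differ between versions):

```cpp
// Sketch: feed n_virtual pre-computed soft-prompt embedding vectors to the model
// before any regular tokens. Assumes ctx/model are already initialized and
// soft_prompt points to n_virtual * n_embd floats (hypothetical data).
#include <cstring>
#include "llama.h"

static int decode_soft_prompt(llama_context * ctx, const llama_model * model,
                              const float * soft_prompt, int n_virtual) {
    const int n_embd = llama_n_embd(model);

    // embd != 0 makes llama_batch_init allocate batch.embd instead of batch.token
    llama_batch batch = llama_batch_init(n_virtual, n_embd, 1);
    batch.n_tokens = n_virtual;

    for (int i = 0; i < n_virtual; ++i) {
        memcpy(batch.embd + i * n_embd, soft_prompt + i * n_embd, n_embd * sizeof(float));
        batch.pos[i]       = i;   // soft prompt occupies positions 0 .. n_virtual-1
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;
        batch.logits[i]    = 0;   // no logits needed for the prefix
    }

    const int ret = llama_decode(ctx, batch);
    llama_batch_free(batch);
    return ret;
}
```

The hard prompt would then be decoded as an ordinary token batch whose positions start at n_virtual, so the soft prompt sits in front of it in the KV cache.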

@robhaslinger (Author)

Basically, you add a few 'virtual tokens' at the front of your prompts by initializing random embeddings and then tuning those embeddings while keeping the parameters of the base model fixed. Supposedly it gets around the trial and error of writing 'hard' prompts in plain text. There's some discussion in the PEFT docs (https://huggingface.co/docs/peft/task_guides/clm-prompt-tuning). To be honest I'm not sure how well this works compared with QLoRA, probably not as well, but I was interested in it as a lightweight way of rapidly switching between tasks. I'm also under the impression that the fine-tuning is a lot quicker than QLoRA, since there are far fewer tunable parameters.

Thanks for pointing out that you can directly input embeddings. I do see that either tokens or embeddings can be put into the llama_batch struct. I don't understand the details yet, as I'm new to your code base, but I'll hunt through the code and try to figure it out.
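
A rough sketch of that second step, assuming tokens holds the output of llama_tokenize for the hard prompt and n_virtual matches the sketch above (decode_hard_prompt is again an illustrative name, not a library function):

```cpp
// Sketch: decode the tokenized hard prompt so that it sits after the
// n_virtual soft-prompt positions in the KV cache.
#include <vector>
#include "llama.h"

static int decode_hard_prompt(llama_context * ctx, const std::vector<llama_token> & tokens,
                              int n_virtual) {
    const int n_tok = (int) tokens.size();

    llama_batch batch = llama_batch_init(n_tok, /*embd =*/ 0, /*n_seq_max =*/ 1);
    batch.n_tokens = n_tok;

    for (int i = 0; i < n_tok; ++i) {
        batch.token[i]     = tokens[i];
        batch.pos[i]       = n_virtual + i;     // offset past the soft prompt
        batch.n_seq_id[i]  = 1;
        batch.seq_id[i][0] = 0;                 // same sequence as the soft prompt
        batch.logits[i]    = (i == n_tok - 1);  // logits only for the last token
    }

    const int ret = llama_decode(ctx, batch);
    llama_batch_free(batch);
    return ret;
}
```

Generation would then proceed as usual, with positions continuing from n_virtual + tokens.size().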

Cheers, Rob

@github-actions bot (Contributor)

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label on Mar 18, 2024
@github-actions bot (Contributor) commented Apr 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot closed this as completed on Apr 3, 2024