This is a proposal to generate multiple messages per prompt in order to collect additional RLHF data from user conversations. It would work by generating the first 'x' tokens of each candidate message and then prompting the user to select one, which would then be generated to completion. The value of 'x' would change with the current load on the inference servers, ranging from whole messages under low load to disabling the feature entirely under high load. This would yield plenty of ranking data for RLHF without butchering the performance of the site: when the servers are at low usage you get lots of RLHF data while improving quality of life for users, and when they are at capacity the feature is simply disabled. Letting users see the first 'x' tokens of a few possible answers would also reduce the number of regenerations, which reduces the net increase in load.
It would go something like this:
User: What is a croissant?
Reply 1: A croissant is a type of French pastry made with layers of yeast-raised dough that are rolled and folded several times. The resulting dough is shaped into a crescent shape and baked until flaky, crisp, and...
Reply 2: Croissants are buttery and delicious. They originated in Austria during the 13th century, with the first recorded use of the word “croissant” appearing in a French manuscript from the year...
Reply 3: A crescent-shaped bread roll often eaten at breakfast. (Sometimes short answers can fit within 'x' tokens)
(User selects reply 1)
Assistant: "A croissant is a type of French pastry made with layers of yeast-raised dough that are rolled and folded several times. The resulting dough is shaped into a crescent shape and baked until flaky, crisp, and golden brown in color.
Traditionally, croissants were made with butter, which gave them their signature rich flavor and flakiness. However, modern versions may also use margarine or other spreads to reduce the fat content. Croissants can be served plain as a breakfast bread or filled with various ingredients such as jam, cheese, or chocolate for added sweetness or savory flavors.
Croissants have become popular around the world, particularly in countries with strong ties to France such as Belgium, Switzerland, and Canada, as well as many other parts of Europe and North America where French cuisine has gained a following. They are often served at cafes alongside coffee or tea, making it a staple breakfast item."
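The load-dependent draft length described in the proposal could be sketched roughly as follows. This is only an illustration, not the project's implementation: `DraftPlan`, `plan_drafts`, the load thresholds, and the token bounds are all hypothetical names and values chosen for the example, and the linear interpolation between the bounds is just one possible policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DraftPlan:
    """Hypothetical description of what to generate for one prompt."""
    num_drafts: int              # how many candidate replies to start
    max_tokens: Optional[int]    # 'x'; None means generate the whole message


def plan_drafts(load: float, num_drafts: int = 3,
                low: float = 0.3, high: float = 0.9,
                min_tokens: int = 32, max_tokens: int = 256) -> DraftPlan:
    """Map a server load in [0, 1] to a draft plan.

    All thresholds and token bounds are assumptions made for this sketch.
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    if load >= high:
        # At capacity: the feature is disabled, one ordinary reply.
        return DraftPlan(num_drafts=1, max_tokens=None)
    if load <= low:
        # Low usage: every draft is a whole message.
        return DraftPlan(num_drafts=num_drafts, max_tokens=None)
    # In between: shrink 'x' linearly from max_tokens down to min_tokens.
    fraction = (load - low) / (high - low)
    x = round(max_tokens - fraction * (max_tokens - min_tokens))
    return DraftPlan(num_drafts=num_drafts, max_tokens=x)
```

For instance, `plan_drafts(0.1)` asks for three complete drafts, while `plan_drafts(0.95)` falls back to a single normal reply. How the load figure itself is measured (queue length, worker utilisation, and so on) is left open here.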
Closes #2931 (the goal changed slightly based on advice from the Discord:
generate full messages rather than the first 'x' tokens, since full messages
are more useful data)
- [x] Create draft selection UI
- [x] Draft inference
- [x] Option to regenerate drafts and serve 3 new ones
- [x] Remember last viewed sibling message
- [x] Store selected draft training data for RLHF
- ~~[ ] Disable drafts when queue is too long / server is under load~~
(Suggested to leave to next PR in the discord)
- [x] Draft markdown rendering
- [x] 'Used plugin' UI for drafts
- [x] Resolve merge conflicts
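As a rough illustration of what storing a selected draft for RLHF might involve, a single selection can be expanded into chosen/rejected pairs. This is a sketch under assumptions, not the code in this PR: `PreferencePair` and `selection_to_pairs` are hypothetical names, and the pairwise format is one common way to represent preference data rather than the schema the project necessarily uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PreferencePair:
    """Hypothetical record: the user preferred `chosen` over `rejected`."""
    prompt: str
    chosen: str
    rejected: str


def selection_to_pairs(prompt: str, drafts: List[str],
                       selected: int) -> List[PreferencePair]:
    """Turn one draft selection into pairwise preference records."""
    if not 0 <= selected < len(drafts):
        raise IndexError("selected must index into drafts")
    chosen = drafts[selected]
    # The selected draft is preferred over every other draft shown.
    return [
        PreferencePair(prompt=prompt, chosen=chosen, rejected=draft)
        for i, draft in enumerate(drafts)
        if i != selected
    ]
```

With three drafts, one selection yields two pairs. Note that a selection only says the chosen draft beat each of the others; it gives no ordering among the drafts that were not picked.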
---------
Co-authored-by: notmd <[email protected]>