Message drafts #3044

Merged
merged 29 commits into from
May 31, 2023

Conversation

someone13574
Contributor

@someone13574 someone13574 commented May 5, 2023

Closes #2931. The goal changed slightly based on advice from the Discord: generate full messages rather than 'x' tokens, since full messages are more useful data.

  • Create draft selection UI
  • Draft inference
  • Option to regenerate drafts and serve 3 new ones
  • Remember last viewed sibling message
  • Store selected draft training data for RLHF
  • [ ] Disable drafts when queue is too long / server is under load (Suggested to leave to next PR in the discord)
  • Draft markdown rendering
  • 'Used plugin' UI for drafts
  • Resolve merge conflicts

@github-actions

github-actions bot commented May 5, 2023

pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

@someone13574 someone13574 changed the title [WIP] Message drafts Message drafts May 13, 2023
@someone13574 someone13574 marked this pull request as ready for review May 13, 2023 01:06
@@ -19,6 +19,7 @@ class DbMessage(SQLModel, table=True):
reports: list["DbReport"] = Relationship(back_populates="message")

parent_id: str | None = Field(None)
active_sibling: bool | None = Field(None, sa_column=sa.Column(pg.JSONB))
Collaborator

What does active_sibling mean exactly? Feels like an odd name

Contributor Author

@someone13574 someone13574 May 13, 2023

In a conversation tree, a parent message can have several child messages, which you switch between using the arrows at the bottom. This variable stores whether that particular message belonged to the last shown thread. That way, when a user makes a long thread, it won't 'disappear' when they load the conversation again. The 'sibling' part refers to how these messages come from the same parent message, so I thought it would be a good term to use. If you have a better alternative, I can change it.

Collaborator

Hmm ok. This feels like it is really information about the chat, not the message. Can we not just store a single active_message value in the chat table, which is the final message of the currently active path down the chat tree? Then from this it is already implied what siblings are active earlier in the tree

This would also simplify the endpoint, we could merge it into update_chat endpoint
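
The suggestion above can be sketched roughly as follows: if only the tail message of the active path is stored on the chat, the active sibling at every level can be recovered by walking `parent_id` links back to the root. This is a minimal illustration, not the code from this PR; the `Message` class and the `active_path` function are hypothetical stand-ins for the real models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    # Hypothetical stand-in for a row in the message table.
    id: str
    parent_id: Optional[str]


def active_path(messages: dict[str, Message], tail_id: str) -> list[str]:
    """Return the message ids from the root down to the stored tail message."""
    path: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = tail_id
    while current is not None:
        if current in seen:
            # Guard against corrupted data that would loop forever.
            raise ValueError(f"cycle detected at message {current}")
        seen.add(current)
        path.append(current)
        current = messages[current].parent_id
    path.reverse()
    return path


# "a1" and "a2" are siblings under "p1"; "p2" continues the thread from "a2".
tree = {
    "p1": Message("p1", None),
    "a1": Message("a1", "p1"),
    "a2": Message("a2", "p1"),
    "p2": Message("p2", "a2"),
}
print(active_path(tree, "p2"))  # ['p1', 'a2', 'p2']
```

Every message on the returned path is the active sibling at its level, so no per-message flag is needed.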

Collaborator

Sorry, only seeing this now: how are the multiple generated draft messages now referenced in the database? And what is finally stored in the chat-message tree?

Contributor Author

@someone13574 someone13574 May 31, 2023

The draft messages are stored as regular messages in the messages table. The message comparisons are stored in a new table called message_eval which contains the chat id, user id, message id of the selected message, and the message ids it was selected over.

I’ve renamed the active_message_id property to active_thread_tail_message_id to make it clearer that its purpose is storing the last viewed thread and not any sort of message comparison.
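
As a rough illustration of the comparison record described above, the sketch below models one `message_eval` entry with plain dataclasses. Only the kinds of data (chat id, user id, selected message id, ids of the messages it was selected over) come from the description; the field names and the `record_selection` helper are assumptions, not the actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class MessageEval:
    # Field names are illustrative; the real table columns may differ.
    chat_id: str
    user_id: str
    selected_message_id: str
    inferior_message_ids: list[str] = field(default_factory=list)


def record_selection(
    chat_id: str, user_id: str, draft_ids: list[str], selected_id: str
) -> MessageEval:
    """Build a comparison record: the chosen draft versus the drafts it beat."""
    if selected_id not in draft_ids:
        raise ValueError("selected message is not one of the drafts")
    rejected = [d for d in draft_ids if d != selected_id]
    return MessageEval(chat_id, user_id, selected_id, rejected)


evaluation = record_selection("chat-1", "user-1", ["d1", "d2", "d3"], "d2")
print(evaluation.inferior_message_ids)  # ['d1', 'd3']
```

The drafts themselves remain ordinary messages; the record only references them by id.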

inference/server/oasst_inference_server/models/chat.py Outdated
@olliestanley
Collaborator

olliestanley commented May 13, 2023

Thanks, I think this mostly makes sense from inference backend perspective, I left a couple of questions and one change request

Collaborator

@olliestanley olliestanley left a comment

Thanks! This seems fine to me from backend perspective now. Will approve tomorrow if neither of Yannic/Andreas wish to review first

@someone13574 someone13574 requested a review from notmd May 28, 2023 04:58
Collaborator

@notmd notmd left a comment

Thank you, need 1 more approval from Yannic or Andreas.

Collaborator

@andreaskoepf andreaskoepf left a comment

Thanks a lot for this important feature.
I would appreciate feedback on the following points:

  1. It seems as if draft generation would always be active when NUM_GENERATED_DRAFTS > 1. This will completely break any regular chat experience and force the user to go through the message-selection process, even for simple "Hello" prompts. In general we definitely want to collect this form of feedback data, but the question is whether we should really make regular use of the assistant impossible. If message alternatives were generated when the user presses regenerate or down-vote, it would IMO be a much more unobtrusive experience, and users would get additional options when they actually need them.

  2. It is not clear to me how draft messages are tracked now in the database. Could you please explain the meaning of the active_message_id column in the chat table (and why it is a JSON field)? How are draft messages stored and referenced in the database?

@someone13574
Contributor Author

  1. I'll make the drafts less intrusive by only generating drafts if either the prompt is long (thus indicating a more complex request) or if the user regenerates a message (indicating that additional options would be beneficial). This should make the process less intrusive and act as a filter against data we really don't need comparisons for (trivial tasks or general conversation).
  2. Draft messages are stored the same way as normal messages, and the comparisons are stored in a new table called message_eval, which holds the ID of the selected message and the IDs of the messages it was selected over. The active_message_id stores the ID of the last message in the currently visible thread, so that the conversation can be loaded in the same state it was in when the user closed it. This is needed because all the unselected drafts are also added to the conversation as other threads, so the last message for a given parent message won't always be the one the user was actually using. Loading onto that unused message instead of where the user was actually writing was very bad for UX, so I made it remember which thread was last selected. I'd be open to a better name. Also, using a JSON column was a mistake; I'll fix it.
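
The rule proposed in point 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name, the character-based length check, and the `min_prompt_chars` threshold of 200 are hypothetical, and the PR may measure prompt length differently.

```python
def should_generate_drafts(
    prompt: str,
    is_regeneration: bool,
    num_generated_drafts: int,
    min_prompt_chars: int = 200,  # hypothetical threshold, not from the PR
) -> bool:
    """Decide whether to offer several drafts instead of a single reply."""
    # With one draft or fewer configured there is nothing to compare.
    if num_generated_drafts <= 1:
        return False
    # A regeneration signals the user wants alternatives; a long prompt
    # is taken as a proxy for a more complex request.
    return is_regeneration or len(prompt) >= min_prompt_chars


print(should_generate_drafts("Hello", False, 3))  # False
print(should_generate_drafts("Hello", True, 3))   # True
```

Short prompts such as "Hello" then get a single reply unless the user explicitly regenerates.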

@andreaskoepf andreaskoepf merged commit 70f30a6 into LAION-AI:main May 31, 2023
@someone13574 someone13574 deleted the message-drafts branch May 31, 2023 18:53
Development

Successfully merging this pull request may close these issues.

Message ranking In Conversations
4 participants