
Enhance chat memory #226

Open
edwardzjl opened this issue Dec 25, 2023 · 5 comments
Labels
enhancement New feature or request python Pull requests that update Python code

Comments

@edwardzjl
Owner

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Enhance chat memory with long-term memory (a running summary), searched memory (a vector retriever), and short-term memory (a buffer window, the current solution), and combine them with CombinedMemory.
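
A rough sketch of the idea, assuming an existing `llm` and `retriever`, and a prompt that references all three memory keys (the keys themselves are placeholders):

```python
from langchain.memory import (
    CombinedMemory,
    ConversationBufferWindowMemory,
    ConversationSummaryMemory,
    VectorStoreRetrieverMemory,
)

# `llm` and `retriever` are assumed to exist; the memory keys are placeholders.
memory = CombinedMemory(
    memories=[
        # Long-term: a rolling summary of the whole conversation.
        ConversationSummaryMemory(llm=llm, memory_key="summary", input_key="input"),
        # Searched: semantically relevant past messages.
        VectorStoreRetrieverMemory(
            retriever=retriever, memory_key="related", input_key="input"
        ),
        # Short-term: the last few turns verbatim (the current solution).
        ConversationBufferWindowMemory(k=5, memory_key="recent", input_key="input"),
    ]
)
```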

@edwardzjl edwardzjl added the enhancement New feature or request label Dec 25, 2023
@edwardzjl edwardzjl self-assigned this Dec 25, 2023
@edwardzjl
Owner Author

Before utilizing any token-counting memory, such as langchain.memory.ConversationSummaryBufferMemory, customization is required.

These memories rely on langchain_core.language_models.base.BaseLanguageModel.get_num_tokens_from_messages to calculate the input token length. However, that calculation goes through langchain_core.messages.get_buffer_string, which may not reproduce the actual prompt string in certain scenarios, such as when using chat templates like chatml.

Additionally, langchain_core.language_models.base.BaseLanguageModel.get_num_tokens_from_messages invokes langchain_core.language_models.base.BaseLanguageModel.get_token_ids, which defaults to GPT2TokenizerFast.from_pretrained("gpt2"). This default can produce inaccurate counts when using an LLM other than ChatGPT.
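
A minimal sketch of that customization, assuming a Hugging Face tokenizer that matches the served model (the model name and the ChatOllama base class are just placeholders):

```python
from typing import List

from langchain_community.chat_models import ChatOllama
from langchain_core.messages import BaseMessage
from transformers import AutoTokenizer

# Placeholder model name: use the tokenizer that matches your actual LLM.
_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# langchain message types -> roles expected by chat templates
_ROLES = {"human": "user", "ai": "assistant", "system": "system"}


class ChatModelWithRealTokenizer(ChatOllama):
    def get_token_ids(self, text: str) -> List[int]:
        # Replace the GPT2TokenizerFast default with the model's own tokenizer.
        return _tokenizer.encode(text, add_special_tokens=False)

    def get_num_tokens_from_messages(self, messages: List[BaseMessage]) -> int:
        # Render through the model's chat template (e.g. chatml) instead of
        # get_buffer_string, so special tokens are counted correctly.
        conversation = [
            {"role": _ROLES.get(m.type, m.type), "content": m.content}
            for m in messages
        ]
        return len(_tokenizer.apply_chat_template(conversation, tokenize=True))
```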

@edwardzjl
Owner Author

I dug a little into vector-based memories; here are some takeaways:

  • There's a default implementation, langchain.memory.VectorStoreRetrieverMemory. (doc)
  • However, the default implementation lacks support for chat memory.
    • Chat memories have a return_messages attribute that controls whether history is loaded as a str or as BaseMessages.
    • VectorStoreRetrieverMemory has a return_docs attribute that controls whether history is loaded as a str or as Documents.
  • What's more, it's difficult (although possible) to separate VectorStoreRetrieverMemory instances into different sessions.
  • There's a langchain.memory.CombinedMemory that can be used to combine multiple memories in a chain or agent. (doc)
  • However, if I combine VectorStoreRetrieverMemory with ConversationBufferWindowMemory, the chat history will be persisted twice:
    • once in the Redis vectorstore, once in the Redis chat history.
  • I think it's possible and intuitive to customize one memory implementation, instead of using CombinedMemory, to combine these two behaviours (see the sketch after this list):
    • vectorstore memory as long-term memory
    • buffer window memory as short-term memory
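
A sketch of that custom memory (the class and field names are mine; it assumes an in-memory chat history, so evicting with pop() is safe):

```python
from typing import Any, Dict, List

from langchain.memory import ConversationBufferWindowMemory, VectorStoreRetrieverMemory
from langchain_core.documents import Document
from langchain_core.memory import BaseMemory


class LongShortTermMemory(BaseMemory):
    """Hypothetical combination: buffer window for recent turns,
    vectorstore for everything that falls out of the window."""

    short_term: ConversationBufferWindowMemory
    long_term: VectorStoreRetrieverMemory

    @property
    def memory_variables(self) -> List[str]:
        return self.short_term.memory_variables + self.long_term.memory_variables

    def load_memory_variables(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Recent turns verbatim plus semantically related older turns.
        return {
            **self.short_term.load_memory_variables(inputs),
            **self.long_term.load_memory_variables(inputs),
        }

    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        self.short_term.save_context(inputs, outputs)
        # Migrate messages that fell out of the window into the vectorstore,
        # so each message is persisted in exactly one place.
        messages = self.short_term.chat_memory.messages
        while len(messages) > 2 * self.short_term.k:
            old = messages.pop(0)
            self.long_term.retriever.add_documents(
                [Document(page_content=old.content, metadata={"type": old.type})]
            )

    def clear(self) -> None:
        self.short_term.clear()
        self.long_term.clear()
```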

@edwardzjl
Owner Author

What about a background task (like a Kubernetes CronJob) that reads all memories and moves the "old" ones into a vectorstore?

  • If I insert them into the vectorstore and then delete them from the buffer window (backed by a Redis list), I might need to add transactions.
  • If I insert them into the vectorstore and leave them in the buffer window, there are more problems:
    • There's potentially huge disk waste.
    • I might need to deal with duplicate records (which might be solved by LangChain's indexing API; see the sketch below).
    • The cron job may take very long to finish as the number of users and the amount of history grow.

Neither is simple to implement.
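
For the duplicate-record problem, here's a rough sketch of what the LangChain indexing API offers (the namespace, db_url, and metadata scheme are placeholders):

```python
from langchain.indexes import SQLRecordManager, index
from langchain_core.documents import Document

# Placeholder namespace and db_url; the record manager remembers document
# hashes across runs, so re-running the cron job does not re-insert messages.
record_manager = SQLRecordManager(
    "redis/chat_memory", db_url="sqlite:///record_manager.sql"
)
record_manager.create_schema()


def archive(messages, vectorstore, session_id: str) -> None:
    docs = [
        Document(page_content=m.content, metadata={"session_id": session_id})
        for m in messages
    ]
    # index() skips documents whose hashes were already recorded.
    index(docs, record_manager, vectorstore, cleanup=None, source_id_key=None)
```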

@edwardzjl
Owner Author

A chatbot is not like a human being. Humans don't inherently retain the exact order of conversation messages; instead, we rely on short-term memory, which eventually consolidates information into long-term memory, where order becomes less crucial.

In contrast, a chatbot must persistently maintain all messages in list order, primarily for later display to the user.

Additionally, it's worth noting that Redis indexing is limited to hash or JSON structures, not lists. I cannot simply add an embedding index to a list of chat messages.

To address the need for both long-term and short-term memory while preserving the correct order of messages for user display, one solution is to duplicate the messages: one copy is stored in a Redis list, and another in a hash with an embedding index.

An alternative approach might involve storing only the list index in the document, as shown below:

{
  "msg_embedding": [],
  "msg_idx": 0
}

Fetching messages with context can then be achieved by accessing the message list using LRANGE, which has an acceptable time complexity of O(S+N) (LRANGE).

However, langchain's langchain_community.chat_message_histories.redis.RedisChatMessageHistory currently stores messages in reverse order using LPUSH (placing the latest message at the beginning of the list). This behaviour needs to be changed, because with LPUSH each new message shifts the index of every existing message.
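
Assuming that adjustment (RPUSH, oldest message first) and the default "message_store:" key prefix, fetching a matched message with its surrounding context might look like:

```python
import json

import redis

r = redis.Redis()


def fetch_with_context(session_id: str, msg_idx: int, window: int = 2) -> list:
    """Load the message at msg_idx plus `window` neighbours on each side.

    Assumes messages are appended with RPUSH (oldest first) under the
    default "message_store:" key prefix.
    """
    start = max(msg_idx - window, 0)
    stop = msg_idx + window
    # LRANGE is O(S+N), which is acceptable per the Redis docs.
    raw = r.lrange(f"message_store:{session_id}", start, stop)
    return [json.loads(m) for m in raw]
```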

@edwardzjl edwardzjl added the python Pull requests that update Python code label Jan 11, 2024
@edwardzjl
Owner Author

langchain's memory system is under refactoring; maybe I will wait a few more weeks until it stabilizes.
