memaryParse #44

kingjulio8238 · 2024-06-08T15:29:56Z

memary currently parses the agents' responses, which are stored in a .txt file, before inserting them into our knowledge graphs.

As we look to support agentic systems running real-world tasks, our memory unit needs to allow the system's maintainer to pre-populate the knowledge graph with relevant data. For example, an e-commerce company may want to upload its users' information so that the agent can respond with context from the start.

Companies may present this data in various file formats, such as .csv, .pdf, .txt, .pptx, or others. That is why memary must support many configurable parsers under a parent parser, memaryParse. For example, a company running an agent with data in .csv and .docx files can configure a parent parser that supports both formats to pre-process the data into the knowledge graph before running their agents using memary.
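To make the parent-parser idea concrete, here is a minimal sketch of how per-format parsers could be registered under one parent and selected by file extension. Everything in it is an assumption for illustration: the `MemaryParse` class, its `register`, `supports`, and `parse` methods, and the `parse_txt` / `parse_csv` helpers are hypothetical names, not memary's actual API.

```python
import csv
from pathlib import Path
from typing import Callable, Dict

# Hypothetical contract: a parser takes a file path and returns extracted text.
Parser = Callable[[Path], str]


class MemaryParse:
    """Illustrative parent parser that routes files to format-specific parsers."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Parser] = {}

    def register(self, extension: str, parser: Parser) -> None:
        # Normalize so "csv", ".csv" and ".CSV" all map to the same key.
        self._parsers["." + extension.lower().lstrip(".")] = parser

    def supports(self, path) -> bool:
        return Path(path).suffix.lower() in self._parsers

    def parse(self, path) -> str:
        path = Path(path)
        parser = self._parsers.get(path.suffix.lower())
        if parser is None:
            raise ValueError(f"No parser configured for {path.suffix!r}")
        return parser(path)


def parse_txt(path: Path) -> str:
    # Plain text needs no transformation.
    return path.read_text(encoding="utf-8")


def parse_csv(path: Path) -> str:
    # Turn each row into one "column: value" line so it reads as text.
    with path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    return "\n".join(
        ", ".join(f"{key}: {value}" for key, value in row.items()) for row in rows
    )
```

A maintainer would then configure only the formats they need, for example `parser = MemaryParse()` followed by `parser.register("csv", parse_csv)`, and call `parser.parse(path)` for each file before the agent runs. Keeping the registry keyed by extension means a new format can be added without touching the parent class.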

We expect memaryParse to expand over time. Initially, we hope to support the following formats:

.txt (already configured)
Table extraction
JSON
Images (.jpg, .jpeg, .png, .gif)
Documents and presentations (.pdf, .doc / .docx, .rtf, .pages, .pptx, .xml, .key)
Web (.htm, .html)
Spreadsheets (.xlsx, .xls, .csv, .numbers)

memaryParse should also support the following result types: TXT, MD, and JSON (we will look to add others in the future).
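As a rough illustration of the three result types, the sketch below renders one parsed document as TXT, MD, or JSON. The `ParsedDocument` dataclass, its fields, and the `render` function are hypothetical names chosen for this example; the actual output shape for each result type is still an open design question.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ParsedDocument:
    # Hypothetical container for one parsed file.
    source: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def render(doc: ParsedDocument, result_type: str = "txt") -> str:
    """Serialize a parsed document as TXT, MD, or JSON (case-insensitive)."""
    kind = result_type.lower()
    if kind == "txt":
        return doc.text
    if kind == "md":
        # Assumed layout: the source name as a heading, then the body text.
        return f"# {doc.source}\n\n{doc.text}\n"
    if kind == "json":
        return json.dumps(
            {"source": doc.source, "text": doc.text, "metadata": doc.metadata},
            ensure_ascii=False,
        )
    raise ValueError(f"Unsupported result type: {result_type!r}")
```

For instance, `render(ParsedDocument("users.csv", "name: Ada"), "MD")` would produce a small Markdown document, and passing an unknown type raises `ValueError` so that misconfiguration surfaces early.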

Resource for inspiration: https://github.com/run-llama/llama_parse/blob/main/llama_parse/utils.py

rawwerks · 2024-06-16T14:43:33Z

for PDFs, please just let me put my llamaindex API key to use llamaparse. they have worked so hard on this, i would strongly advise not to re-invent it.

for other datatypes (and for getting the text from llamaparse into a KG), here are some resources. neo4j has done a lot of work on this.
