Use any document as context #112
Replies: 1 comment 2 replies
-
It's an old question, but I'm facing the same problem. Here is an approach based on the functions example.

With this prompt: "Search on document: What did Biden say about Ketanji Brown Jackson in the State of the Union address?", the result will be something like: "Biden said in the State of the Union address that he nominated Judge Ketanji Brown Jackson to serve on the United States Supreme Court. He highlighted her as one of our nation's top legal minds, who will continue Justice Breyer's legacy of excellence."

Below is a (not optimized) example:

```typescript
import * as fs from 'fs'
import { FaissStore } from '@langchain/community/vectorstores/faiss'
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter'
import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers'

// Embed the chunks locally with a small sentence-embedding model
const embeddings = new HuggingFaceTransformersEmbeddings({
  modelName: 'Xenova/all-MiniLM-L6-v2'
})

// Load the document, split it into chunks, and index them in a vector store
const text = fs.readFileSync('./documents/state_of_the_union.txt', 'utf8')
const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 512 })
const docs = await textSplitter.createDocuments([text])
const vectorStore = await FaissStore.fromDocuments(docs, embeddings)

// Function definition exposed to the model
export default {
  description: 'Call this if the user asks to retrieve the most recent information about the State of the Union address',
  params: {
    type: 'object',
    properties: {
      input: {
        description: 'what you want to retrieve from this document',
        type: 'string'
      }
    }
  },
  async handler({ input }) {
    console.log('call getDocumentContent')
    // Return only the single most similar chunk
    const result = await vectorStore.similaritySearch(input, 1)
    console.log(result[0].pageContent)
    return result[0].pageContent // just for testing, to refine
  }
}
```

My 2 cents: I've tried to do everything in langchain. For example, I've created a class that uses node-llama-cpp (although in beta) as an LLM with langchain agents, etc., because I find them very interesting. But IMHO they are too focused on OpenAI, so every example that works with OpenAI is likely to fail with the rest, at least for now. The beta 3 of node-llama-cpp is awesome; it works well, and the functions are a big step forward. Currently it only supports a few models, but I expect a lot from this side. So, by using langchain's retrievers together with node-llama-cpp's functions and chat history, it's possible to achieve good results.
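For intuition, here is a dependency-free sketch of what a vector store's `similaritySearch` does under the hood: rank the embedded chunks by cosine similarity to the embedded query and return the top matches. The `cosineSimilarity` and `topK` names are illustrative helpers, not library APIs, and real stores like Faiss use approximate-nearest-neighbor indexes rather than a full sort:

```typescript
interface EmbeddedChunk { text: string; vector: number[] }

// Cosine similarity between two equal-length vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Return the k chunks most similar to the query vector
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k)
}
```

The function handler above is just this with `k = 1`: the model's `input` string is embedded and the closest chunk is returned as context.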
-
My question would be: what is the best way to use the model to analyze a large document? From what I noticed, just passing it in as plain text does not work. I would like an example of how to split the text and provide it as context, as is done with langchain.
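Until a full example is posted, here is a minimal, dependency-free sketch of the splitting step: fixed-size character chunks with overlap, similar in spirit to what langchain's `RecursiveCharacterTextSplitter` produces (the `chunkSize` and `overlap` defaults here are illustrative). Each chunk can then be embedded and indexed so that only the relevant pieces, not the whole document, are passed to the model as context:

```typescript
// Split text into chunks of at most chunkSize characters,
// with `overlap` characters shared between consecutive chunks
// so that sentences cut at a boundary still appear intact somewhere.
function splitText(text: string, chunkSize = 512, overlap = 64): string[] {
  const chunks: string[] = []
  let start = 0
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize))
    if (start + chunkSize >= text.length) break
    start += chunkSize - overlap
  }
  return chunks
}
```

Note that langchain's splitter is smarter than this: it tries to break on paragraph and sentence boundaries before falling back to raw character positions.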