Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] retrieve and index calls are always actions #291

Merged
merged 1 commit into from
May 31, 2024
Merged

Conversation

jba
Copy link
Contributor

@jba jba commented May 31, 2024

Add ai.DefineRetriever, which creates and registers core.Actions for
the index and retrieve methods of a vector DB.

Remove RegisterRetriever.

Remove the Init functions from our vector DB plugins.
Instead, the New functions return a Retriever that supports
actions directly.

There is no longer a way to create a Retriever that doesn't
use actions.

Add ai.DefineRetriever, which creates and registers core.Actions for
the index and retrieve methods of a vector DB.

Remove RegisterRetriever.

Remove the Init functions from our vector DB plugins.
Instead, the New functions return a Retriever that supports
actions directly.

There is no longer a way to create a Retriever that doesn't
use actions.
@jba
Copy link
Contributor Author

jba commented May 31, 2024

Open questions:

  • In Go there is an interface, ai.Retriever, that provides both retrieve and index capabilities. That is nice to have, since those two things are (indeed must be?) handled by the same backend. There is only one value to pass around, and when creating one, the same underlying values (like the client) can be used for both (whereas the JS calls new Pinecone twice). Nevertheless, is it worth keeping, since it's inconsistent with JS?

  • If we do keep the two-method interface, Retriever is a confusing name for it. Should we rename it, perhaps to VectorDB?

  • The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing? Offhand I see metadata, perhaps some options, and the line (i) => runner(new Document(i.query), i.options). I'm not sure if Go is missing these or we just have a different design.

  • The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

@ssbushi
Copy link
Contributor

ssbushi commented May 31, 2024

My take on your open questions:

Open questions:

is it worth keeping [the Retriever interface], even if it's inconsistent with JS?
If we do keep the two-method interface, Retriever is a confusing name for it. Should we rename it, perhaps to VectorDB?

I personally like the Retriever interface for the reasons you already mentioned. I even had a common object in JS at one point called DocumentStore, but we scrapped that for reasons I don't remember. If I had to rename the interface, DocumentStore would be a better option than VectorDB since the underlying retrievers/indexers do not have to use vectors. I'm open to other suggestions (or hearing more on whether folks would rather not have this interface at all).

The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing?

The metadata is the one that I see as the obvious difference. It's used by the DevUI to provide playgrounds. @pavelgj may know other differences that I'm missing. The JS defineRetriever does the conversion from the query to a Document which is what you see in (i) => runner(new Document(i.query), i.options) but since Go's Retriever.Retrieve accepts RetrieverRequest which only take a Document, this is explicit in Go's current design.

The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

I don't think we need to expose actions to the user. If they need them, they can use LookupAction to get them. Again, @pavelgj should know of more advanced cases.

Comment on lines +59 to +62
ia := core.NewAction(name, core.ActionTypeIndexer, nil, func(ctx context.Context, req *IndexerRequest) (struct{}, error) {
return struct{}{}, index(ctx, req)
})
core.RegisterAction(name, ia)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add core.DefineAction function that does this -- NewAction + RegisterAction.

Ex:

export function defineAction<

@pavelgj
Copy link
Collaborator

pavelgj commented May 31, 2024

The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing?

The metadata is the one that I see as the obvious difference. It's used by the DevUI to provide playgrounds. @pavelgj may know other differences that I'm missing. The JS defineRetriever does the conversion from the query to a Document which is what you see in (i) => runner(new Document(i.query), i.options) but since Go's Retriever.Retrieve accepts RetrieverRequest which only take a Document, this is explicit in Go's current design.

I don't think Dev UI relies too much on retriever metadata today... this sounds fine. Haven't had a chance to play with retrievers in the Dev UI yet.

The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

I don't think we need to expose actions to the user. If they need them, they can use LookupAction to get them. Again, @pavelgj should know of more advanced cases.

I'm not opposed to using actions directly, there's no usability problems with that (we actually allow using actions directly in JS as well), the question is more about whether it's important to support referencing retrievers using a string. We know that it's important for models for dotprompt files, but that's not in code. I think it's important to establish a good pattern where the devX allows to define retrievers, models, etc. and then reference them down the call stack. The issue is that it requires that the developer passes the references to these actions throughout the code and I'm not a huge fan of that. Using a string you can define/configure the retriever in one spot/file/package and then just reference it from another without having to deal with obtaining the reference. I'm not sure if there's a clean way to support both in Go... in JS the union types make it easy string | ModelReference | ModelAction.

@jba jba merged commit 8a65fea into main May 31, 2024
5 checks passed
@jba jba deleted the jba-define-ret branch May 31, 2024 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants