[Go] retrieve and index calls are always actions #291

jba · 2024-05-31T12:29:35Z

Add ai.DefineRetriever, which creates and registers core.Actions for
the index and retrieve methods of a vector DB.

Remove RegisterRetriever.

Remove the Init functions from our vector DB plugins.
Instead, the New functions return a Retriever that supports
actions directly.

There is no longer a way to create a Retriever that doesn't
use actions.

Add ai.DefineRetriever, which creates and registers core.Actions for the index and retrieve methods of a vector DB. Remove RegisterRetriever. Remove the Init functions from our vector DB plugins. Instead, the New functions return a Retriever that supports actions directly. There is no longer a way to create a Retriever that doesn't use actions.

jba · 2024-05-31T12:46:01Z

Open questions:

In Go there is an interface, ai.Retriever, that provides both retrieve and index capabilities. That is nice to have, since those two things are (indeed must be?) handled by the same backend. There is only one value to pass around, and when creating one, the same underlying values (like the client) can be used for both (whereas the JS calls new Pinecone twice). Nevertheless, is it worth keeping, since it's inconsistent with JS?
If we do keep the two-method interface, Retriever is a confusing name for it. Should we rename it, perhaps to VectorDB?
The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing? Offhand I see metadata, perhaps some options, and the line (i) => runner(new Document(i.query), i.options). I'm not sure if Go is missing these or we just have a different design.
The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

ssbushi · 2024-05-31T14:34:25Z

My take on your open questions:

Open questions:

is it worth keeping [the Retriever interface], even if it's inconsistent with JS?
If we do keep the two-method interface, Retriever is a confusing name for it. Should we rename it, perhaps to VectorDB?

I personally like the Retriever interface for the reasons you already mentioned. I even had a common object in JS at one point called DocumentStore, but we scrapped that for reasons I don't remember. If I had to rename the interface, DocumentStore would be a better option than VectorDB since the underlying retrievers/indexers do not have to use vectors. I'm open to other suggestions (or hearing more on whether folks would rather not have this interface at all).

The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing?

The metadata is the one that I see as the obvious difference. It's used by the DevUI to provide playgrounds. @pavelgj may know other differences that I'm missing. The JS defineRetriever does the conversion from the query to a Document which is what you see in (i) => runner(new Document(i.query), i.options) but since Go's Retriever.Retrieve accepts RetrieverRequest which only take a Document, this is explicit in Go's current design.

The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

I don't think we need to expose actions to the user. If they need them, they can use LookupAction to get them. Again, @pavelgj should know of more advanced cases.

pavelgj · 2024-05-31T18:51:13Z

go/ai/retriever.go

+	ia := core.NewAction(name, core.ActionTypeIndexer, nil, func(ctx context.Context, req *IndexerRequest) (struct{}, error) {
+		return struct{}{}, index(ctx, req)
+	})
+	core.RegisterAction(name, ia)


Suggestion: add core.DefineAction function that does this -- NewAction + RegisterAction.

Ex:

genkit/js/core/src/action.ts

Line 209 in 5dd2eae

export function defineAction<

pavelgj · 2024-05-31T19:06:54Z

The Go DefineRetriever is much simpler than the corresponding JS defineRetriever. What is missing?

The metadata is the one that I see as the obvious difference. It's used by the DevUI to provide playgrounds. @pavelgj may know other differences that I'm missing. The JS defineRetriever does the conversion from the query to a Document which is what you see in (i) => runner(new Document(i.query), i.options) but since Go's Retriever.Retrieve accepts RetrieverRequest which only take a Document, this is explicit in Go's current design.

I don't think Dev UI relies too much on retriever metadata today... this sounds fine. Haven't had a chance to play with retrievers in the Dev UI yet.

The Retriever that you get from DefineRetriever contains the actions, so the user just needs to call the methods on the Retriever directly. But the user cannot access the actions themselves. In JS, I think that the actions are returned, and the user must call the retrieve function with the action to run it. Is there an advantage to giving the actions to the user? Can they do something with them other than invoke them?

I don't think we need to expose actions to the user. If they need them, they can use LookupAction to get them. Again, @pavelgj should know of more advanced cases.

I'm not opposed to using actions directly, there's no usability problems with that (we actually allow using actions directly in JS as well), the question is more about whether it's important to support referencing retrievers using a string. We know that it's important for models for dotprompt files, but that's not in code. I think it's important to establish a good pattern where the devX allows to define retrievers, models, etc. and then reference them down the call stack. The issue is that it requires that the developer passes the references to these actions throughout the code and I'm not a huge fan of that. Using a string you can define/configure the retriever in one spot/file/package and then just reference it from another without having to deal with obtaining the reference. I'm not sure if there's a clean way to support both in Go... in JS the union types make it easy string | ModelReference | ModelAction.

jba requested review from randall77, ianlancetaylor, pavelgj, ssbushi and apascal07 May 31, 2024 12:29

pavelgj reviewed May 31, 2024

View reviewed changes

pavelgj approved these changes May 31, 2024

View reviewed changes

ianlancetaylor approved these changes May 31, 2024

View reviewed changes

jba merged commit 8a65fea into main May 31, 2024
5 checks passed

jba deleted the jba-define-ret branch May 31, 2024 20:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Go] retrieve and index calls are always actions #291

[Go] retrieve and index calls are always actions #291

jba commented May 31, 2024

jba commented May 31, 2024

ssbushi commented May 31, 2024

pavelgj May 31, 2024

pavelgj commented May 31, 2024

[Go] retrieve and index calls are always actions #291

[Go] retrieve and index calls are always actions #291

Conversation

jba commented May 31, 2024

jba commented May 31, 2024

ssbushi commented May 31, 2024

pavelgj May 31, 2024

Choose a reason for hiding this comment

pavelgj commented May 31, 2024