-
Notifications
You must be signed in to change notification settings - Fork 758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VS-286 Support mixed-mode local bindings for Vectorize in Miniflare #6916
Conversation
🦋 Changeset detectedLatest commit: 2559ae6 The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
A wrangler prerelease is available for testing. You can install this latest build in your project with: npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-wrangler-6916 You can reference the automatically updated head of this PR with: npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/prs/6916/npm-package-wrangler-6916 Or you can use npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-wrangler-6916 dev path/to/script.js Additional artifacts:npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-create-cloudflare-6916 --no-auto-update npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-kv-asset-handler-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-miniflare-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-pages-shared-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-vitest-pool-workers-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-workers-editor-shared-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-workers-shared-6916 npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/11503159617/npm-package-cloudflare-workflows-shared-6916 Note that these links will no longer work once the GitHub Actions artifact expires.
Please ensure constraints are pinned, and |
617daeb
to
b3c5862
Compare
What’s the reasoning behind disabling write in local mode? Doesn’t seem very useful without ability to write |
@cloudkite Both D1 and R2 operate on local instances that are separate from prod. Since we're directly connecting to the prod Vectorize instance instead, we definitely don't want to catch people off guard and make them cause changes in prod while testing "locally". We're considering re-enabling write operations behind an optional flag, and I think that's probably the best option. |
@garrettgu10 makes sense in terms of accidentally writing to a live db instance. However having trouble imaging how a readonly "local" db will be useful for development/testing purposes? |
@cloudkite we are not (yet) launching a local simulation database for vectorize. All requests are sent to the prod database in local dev mode. |
@garrettgu10 yup I understood that part. I just meant if I can’t write I’ll still need to use —remote How about when wrangler is in local mode it refuses to bind unless the index name has suffix of “-local-dev” or similar, that would make it very unlikely for someone to accidentally bind to their live/prod vectorize index |
This is not the direction we want to go at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make sure to update these docs when this lands:
Hey all! 👋 I'm excited to see support for Vectorize in a local mode, but I'm a little sad to see this isn't a fully-local implementation. Whilst there's precedent for proxying to remote services with Workers AI, Vectorize is a stateful service. As @garrettgu10 mentioned, this encourages developers to reuse their production indices in development, which is nice in some scenarios, but contrasts to every other Cloudflare data store. This will undoubtedly cause confusion for users, and should ideally be part of a larger mixed-mode story. This confusion would be compounded if a fully-local implementation of Vectorize was built in the future, as you'd likely want to enable this by default to match other data stores. Arguably the main reason local support for Workers AI is provided by a remote proxy is the large hardware requirements to run the models, both in terms of storage and compute. These requirements don't apply to Vectorize. When building local simulators, the primary concern is correctly implementing the production semantics, not performance. The size of datasets is significantly smaller in local mode. This enables you to build much simpler implementations of the services. It should be possible to build a Vectorize simulator with brute force vector similarity on top of the SQLite API provided by Durable Objects, similar to the other simulators. You could explore enabling an SQLite vector index extension in A fully-local simulator would also allow Vectorize to be used with the Workers Vitest pool. I know I don't work at Cloudflare anymore and I'm sure there's discussions about this internally, but I hope you consider these points. Again, I'm very excited to see local mode development, I just want to make sure the user experience is consistent. 🙂 |
@mrbbot Thanks for your comments, they've been super helpful internally and we've discussed them at length. I agree that a local simulator is the superior solution re. DX and seems fairly simple to implement as a first-pass, but the team doesn't have the bandwidth for maintaining such a solution at the moment, esp. wrt. feature-parity with updates to the service. With that in mind, I think the best compromise is to allow users to bind to production indexes through an optional flag such as |
@garrettgu10 This feels like a pretty big break from how other bindings are implemented, and I'd be concerned about adding an If there aren't any fundamental limits on getting Vectorize working locally, would it be possible to revisit that discussion? Do you have a link to a decision doc or anything around this? |
92e22ea
to
0ee1b16
Compare
42d1144
to
1e504fd
Compare
bb397ac
to
b55b895
Compare
9d247e7
to
3fca937
Compare
3fca937
to
0fe9064
Compare
68846c0
to
17e6939
Compare
17e6939
to
2559ae6
Compare
@@ -579,7 +640,6 @@ describe.sequential.each(RUNTIMES)("Bindings: $flags", ({ runtime, flags }) => { | |||
it.skipIf(isLocal).todo("exposes send email bindings"); | |||
it.skipIf(isLocal).todo("exposes browser bindings"); | |||
it.skipIf(isLocal).todo("exposes Workers AI bindings"); | |||
it.skipIf(isLocal).todo("exposes Vectorize bindings"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
What this PR solves / how to test
Fixes VS-286.
Author has addressed the following