Switch from google_generative_ai package to firebase_vertexai (and from BYO API key to BYO Firebase project) #53

Closed
MrCsabaToth opened this issue Sep 19, 2024 · 12 comments
Labels: enhancement (New feature or request), function tooling (Generative AI Function / Tooling related), multi modal (Multi Modality related), RAG (Retrieval Augmented Generation related)

Comments

@MrCsabaToth
Member

We experimented with https://pub.dev/packages/firebase_vertexai/ for multilingual embedding (#48). That didn't come to fruition (google-gemini/generative-ai-dart#209 and firebase/flutterfire#13269). More recently it also turned out that https://github.com/google-gemini/generative-ai-dart/ doesn't support file upload (google-gemini/generative-ai-dart#211 and google-gemini/generative-ai-dart#70). This is crucial because the audio and video modalities (and possibly PDF and others, everything except images) need file upload instead of inline data (#38 and https://discuss.ai.google.dev/t/gemini-1-5-refuses-to-process-audio-files/39713/5?u=tocsa).
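
To make the inline-data versus file-upload distinction concrete, here is a minimal sketch of the two request part shapes. The field names (`inlineData`, `fileData`, `mimeType`, `fileUri`) follow the Vertex AI REST JSON as I understand it; the Dart packages wrap these in their own classes, and the helper names and the bucket path are hypothetical.

```python
import base64


def inline_part(data: bytes, mime_type: str) -> dict:
    """Embed the media bytes directly in the request as base64 text."""
    return {
        "inlineData": {
            "mimeType": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        }
    }


def file_part(gs_uri: str, mime_type: str) -> dict:
    """Reference media that was already uploaded to Cloud Storage."""
    if not gs_uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    return {"fileData": {"mimeType": mime_type, "fileUri": gs_uri}}


# Hypothetical bucket and object names, for illustration only.
audio = file_part("gs://example-project.appspot.com/clips/question.mp3", "audio/mpeg")
```

With the inline form the request carries the whole payload, while the file form only carries a reference, which is why larger media such as audio and video go through an upload first.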

Firebase has offered file upload, unrelated to AI, for a long time now, so we'll take the leap of faith and convert. Anyone who wants to kickstart and replicate this project would need to set up two cloud functions anyway (for Chirp / STT and TTS), so they'd have to deal with more than just an AI Studio (ex-MakerSuite) API key. With multilingual embedding and reranking we'll have two more cloud functions, and setting all this up will simply be easier on Firebase than in the "big boy" Vertex AI (where someone needs to set up service accounts, roles, and the whole nine yards).

@MrCsabaToth MrCsabaToth added the enhancement New feature or request label Sep 19, 2024
@MrCsabaToth MrCsabaToth self-assigned this Sep 19, 2024
@MrCsabaToth
Member Author

This is the way! google-gemini/generative-ai-dart#70 (comment)
[Image: Gemini_Generated_Image_34cwwt34cwwt34cw]

MrCsabaToth added a commit that referenced this issue Sep 21, 2024
…salign Firebase registrations, and also lead to build error:

Execution failed for task ':app:processDevelopmentDebugGoogleServices'. No matching client found for package name dev.csaba.inspector_gadget.dev /android/app/google-services.json
MrCsabaToth added a commit that referenced this issue Sep 21, 2024
Currently battling with [firebase_functions/internal] Response is not valid JSON object.
errorCode: firebase_functions
errorMessage: com.google.firebase.functions.FirebaseFunctionsException: Response is not valid JSON object.
@MrCsabaToth
Member Author

Note that this runs counter to the goal of working off-device, but Firestore now supports vector storage and vector search: https://cloud.google.com/firestore/docs/vector-search

Also note that we perform dimensionality reduction by folding (instead of truncation), which currently leads to non-normalized vectors. This means that the dot product (potentially the most cost-effective distance measure) is no longer a valid distance: https://cloud.google.com/firestore/docs/vector-search#choose-distance-measure
So maybe we should normalize after the folding?
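
A minimal sketch of "normalize after the folding", under an assumption: "folding" is taken here to mean summing equal-sized chunks of the embedding element-wise, which may not match the project's actual folding step. The function names are hypothetical.

```python
import math


def fold(vector: list[float], target_dim: int) -> list[float]:
    """Reduce dimensionality by summing equal-sized chunks element-wise.

    Assumption: this is what "folding" means; the project's real
    implementation may differ.
    """
    if target_dim <= 0 or len(vector) % target_dim != 0:
        raise ValueError("vector length must be a positive multiple of target_dim")
    folded = [0.0] * target_dim
    for index, value in enumerate(vector):
        folded[index % target_dim] += value
    return folded


def normalize(vector: list[float]) -> list[float]:
    """Scale the vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


folded = fold([1.0, 2.0, 3.0, 4.0], 2)  # [4.0, 6.0], not unit length
unit = normalize(folded)                # unit length again
```

Once the folded vectors are back at unit length, the dot product ranks neighbors the same way cosine similarity does.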

MrCsabaToth added a commit that referenced this issue Sep 22, 2024
… JSON content inside a 'data' field, and it also expects it back that way! #53
MrCsabaToth added a commit that referenced this issue Sep 22, 2024
… JSON content inside a 'data' field, and it also expects it back that way! #53
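
A sketch of the envelope those commit messages describe, assuming only what they say: the payload travels inside a 'data' field in both directions. The helper names are hypothetical, and this is not the Firebase SDK's own code.

```python
import json


def wrap(payload) -> str:
    """Serialize a payload inside the 'data' envelope."""
    return json.dumps({"data": payload})


def unwrap(body: str):
    """Extract the payload, rejecting bodies that lack the envelope."""
    parsed = json.loads(body)
    if not isinstance(parsed, dict) or "data" not in parsed:
        raise ValueError("expected a JSON object with a 'data' field")
    return parsed["data"]


request_body = wrap({"text": "hello"})
payload = unwrap(request_body)
```

A function that returns bare JSON without the envelope would fail the `unwrap` step, which is one plausible way to end up with a "Response is not valid JSON object" style error.
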
@MrCsabaToth
Member Author

TODO: enforce App Check on functions, convert them? https://firebase.google.com/docs/app-check/cloud-functions?hl=en

@MrCsabaToth
Member Author

Dealing with two errors right now:

  1. "The caller does not have permission": a server-side failure when trying to handle modalities persisted in Firebase Storage
  2. "Please ensure that function call turn comes immediately after a user turn or after a function response turn." when trying function calls

So far this is many steps back compared to https://pub.dev/packages/google_generative_ai, with many lost features!
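
The second error message can be read as an ordering rule on the chat history. The sketch below encodes only the wording of that message; the turn labels and the function name are hypothetical, and the service's real validation may be stricter or different.

```python
from typing import Optional


def first_misplaced_function_call(turns: list[str]) -> Optional[int]:
    """Return the index of the first function call turn that does not
    directly follow a user turn or a function response turn, else None.

    Turn labels ("user", "model_text", "function_call",
    "function_response") are hypothetical names for this sketch.
    """
    allowed_previous = {"user", "function_response"}
    for index, kind in enumerate(turns):
        if kind != "function_call":
            continue
        if index == 0 or turns[index - 1] not in allowed_previous:
            return index
    return None


good = ["user", "function_call", "function_response", "function_call",
        "function_response", "model_text"]
bad = ["user", "model_text", "function_call"]
```

Running a check like this over the history before sending it can help locate which stored turn breaks the expected order.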

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 22, 2024

@MrCsabaToth
Member Author

After temporarily granting all read access I got a different error: "Service agents are being provisioned (https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents). Service agents are needed to read the Cloud Storage file provided. So please try again in a few minutes."

@MrCsabaToth
Member Author

I also ran into "Unable to submit request because function parameters schema should be of type OBJECT. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling", but I refactored to eliminate the two local tools, Location and HRM, which were the only ones not using SchemaType.Object: google-gemini/generative-ai-dart#194 (comment)
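
For illustration, a function declaration whose parameters schema has type OBJECT at the top level, written as the plain JSON shape rather than the Dart classes. The tool name, its parameter, and the helper function are hypothetical; only the requirement that the top-level type be OBJECT comes from the error message.

```python
def function_declaration(name: str, description: str,
                         properties: dict, required: list[str]) -> dict:
    """Build a declaration whose parameters schema is always an OBJECT."""
    missing = [key for key in required if key not in properties]
    if missing:
        raise ValueError(f"required keys not in properties: {missing}")
    return {
        "name": name,
        "description": description,
        "parameters": {
            "type": "OBJECT",
            "properties": properties,
            "required": required,
        },
    }


# Hypothetical tool, for illustration only.
weather_tool = function_declaration(
    name="fetch_weather_forecast",
    description="Returns the weather forecast for a place.",
    properties={"place": {"type": "STRING", "description": "City name"}},
    required=["place"],
)
```

A tool that takes a single scalar therefore still wraps it in an object with one named property.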

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 22, 2024

Even after manually granting Storage Object Viewer rights to the AI Platform / Vertex AI service agent, I still get "The caller does not have permission". These are the steps I followed:

Go to the GCP Storage page for the Firebase Storage bucket: https://console.cloud.google.com/storage/browser/{PROJECT_NAME}.appspot.com

  1. Go to the Permissions tab
  2. Under the View Principals tab click Grant Access
  3. Under Add Principals, in the New principals field, type service-{PROJECT_NUMBER}@gcp-sa-aiplatform.iam.gserviceaccount.com
  4. The principal will be found and listed; click on the found principal
  5. In the roles field type "aiplatform.serviceAgent" and click on the found role
  6. Click Save
  7. Add the Storage Object Viewer role to that service account.

For now I have granted read access to the public as a workaround.
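
The grant from the steps above, expressed as the IAM policy binding it should produce on the bucket. This is a sketch: the helper names and the sample project number are hypothetical, `roles/storage.objectViewer` is the role ID for Storage Object Viewer, and the service agent address pattern is the one quoted in the steps.

```python
def vertex_service_agent(project_number: str) -> str:
    """Address of the Vertex AI service agent, as quoted in the steps."""
    if not project_number.isdigit():
        raise ValueError("project number must be numeric")
    return f"service-{project_number}@gcp-sa-aiplatform.iam.gserviceaccount.com"


def object_viewer_binding(project_number: str) -> dict:
    """IAM binding giving the service agent read access to bucket objects."""
    return {
        "role": "roles/storage.objectViewer",
        "members": [f"serviceAccount:{vertex_service_agent(project_number)}"],
    }


# Hypothetical project number, for illustration only.
binding = object_viewer_binding("123456789012")
```

Comparing this against the bucket's actual policy is one way to confirm that the console steps took effect before falling back to public read access.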

@MrCsabaToth MrCsabaToth added RAG Retrieval Augmented Generation related multi modal Multi Modality related function tooling Generative AI Function / Tooling related labels Sep 22, 2024
@MrCsabaToth
Member Author

Google has finally released the -002 stable production models of gemini-1.5-flash and gemini-1.5-pro, and since the Dart Firebase generative AI package relies on stable versions only, we can now leverage the updated capabilities compared to the May -001 releases. For more, see https://www.linkedin.com/posts/chandraai_gemini-vertex-tpu-activity-7244512521435406336-vo5I and https://www.linkedin.com/posts/chandraai_gemini-genai-aiforbusiness-activity-7244548489437704193-qlvN

@MrCsabaToth
Member Author

MrCsabaToth commented Sep 30, 2024

I switched over to the -002 models explicitly, and the function calling behavior changed. In the past (and in the submission demo) I could simply ask "What will be the weather tomorrow" or "What will be the weather next week". The model assumed (correctly) that I implicitly meant the weather at my current location, relative to the current date/time.

The new model is very literal and picky; it doesn't treat anything as implied. It asks whether I'll stay at my current location tomorrow (or next week) before answering the question, and it also doesn't seem to be aware of the current date / time, so that needs to be stuffed into the prompt.
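
A minimal sketch of stuffing the date/time and the implied location into the prompt. The function name, the wording of the preamble, and the sample coordinates are all hypothetical; the real prompt in the project may look quite different.

```python
from datetime import datetime, timezone


def context_preamble(now: datetime, latitude: float, longitude: float) -> str:
    """Spell out the context that the older model used to assume."""
    return (
        f"Current date and time: {now.isoformat()}. "
        f"The user's current location is latitude {latitude:.4f}, "
        f"longitude {longitude:.4f}. "
        "Unless told otherwise, assume questions refer to this location "
        "and are relative to this date and time."
    )


# Hypothetical timestamp and coordinates, for illustration only.
preamble = context_preamble(
    datetime(2024, 9, 30, 12, 0, tzinfo=timezone.utc), 36.7, -119.8
)
```

The preamble would go into the system instruction or be prepended to the user turn, so that "tomorrow" and "here" have something to resolve against.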

MrCsabaToth added a commit that referenced this issue Sep 30, 2024
…ent optional #53

This removes ugly bubbling through the function calling facilities of latlon and HR
MrCsabaToth added a commit that referenced this issue Sep 30, 2024
…AI service account to access the Cloud Storage #53 #38
MrCsabaToth added a commit that referenced this issue Sep 30, 2024
MrCsabaToth added a commit that referenced this issue Sep 30, 2024
MrCsabaToth added a commit that referenced this issue Sep 30, 2024
…after a user turn or after a function response turn." problem #53 #56
@MrCsabaToth
Member Author

The model was able to describe what's in an image (this time the image was passed as a storage gs:// URL after an upload, instead of passing the payload along with the API call), but then right away it continued with "I cannot process images yet". Extremely weird. This is the new -002 version. Flash still does some description (it saw that it was a Ms Fields cookie), while the Pro model flat out refuses to say anything about it.

@MrCsabaToth
Member Author

Reranking (#39) is not in place yet, but the other mechanisms are converted.
