Date: Wed, 18 Sep 2024 05:26:54 +0200
Subject: [PATCH 03/10] chore(deps): update dependency vite to v5.4.6
[security] (#1115)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [vite](https://vitejs.dev) ([source](https://redirect.github.com/vitejs/vite/tree/HEAD/packages/vite)) | [`5.4.3` -> `5.4.6`](https://renovatebot.com/diffs/npm/vite/5.4.3/5.4.6) | [![age](https://developer.mend.io/api/mc/badges/age/npm/vite/5.4.6?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/npm/vite/5.4.6?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/npm/vite/5.4.3/5.4.6?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/npm/vite/5.4.3/5.4.6?slim=true)](https://docs.renovatebot.com/merge-confidence/) |
---
> [!WARNING]
> Some dependencies could not be looked up. Check the warning logs for more information.
### GitHub Vulnerability Alerts
#### [CVE-2024-45811](https://redirect.github.com/vitejs/vite/security/advisories/GHSA-9cwx-2883-4wfx)
### Summary
The contents of arbitrary files can be returned to the browser.
### Details
`@fs` denies access to files outside of the Vite serving allow list. Adding `?import&raw` to the URL bypasses this restriction and returns the file content if it exists.
### PoC
```sh
$ npm create vite@latest
$ cd vite-project/
$ npm install
$ npm run dev
$ echo "top secret content" > /tmp/secret.txt
# expected behaviour
$ curl "http://localhost:5173/@fs/tmp/secret.txt"
403 Restricted
The request url "/tmp/secret.txt" is outside of Vite serving allow list.
# security bypassed
$ curl "http://localhost:5173/@fs/tmp/secret.txt?import&raw"
export default "top secret content\n"
//# sourceMappingURL=data:application/json;base64,eyJ2...
```
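After upgrading, re-running the bypass request from the PoC is a quick way to confirm the fix (a sketch assuming the dev server and `/tmp/secret.txt` from the PoC above; adjust the port and path for your project):
```sh
$ npm install --save-dev vite@5.4.6
$ npm run dev
# With the patched server, the raw-import query string should be
# rejected like the plain request above (403 Restricted):
$ curl "http://localhost:5173/@fs/tmp/secret.txt?import&raw"
```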
---
### Release Notes
vitejs/vite (vite)
### [`v5.4.6`](https://redirect.github.com/vitejs/vite/releases/tag/v5.4.6)
[Compare Source](https://redirect.github.com/vitejs/vite/compare/v5.4.5...v5.4.6)
Please refer to [CHANGELOG.md](https://redirect.github.com/vitejs/vite/blob/v5.4.6/packages/vite/CHANGELOG.md) for details.
### [`v5.4.5`](https://redirect.github.com/vitejs/vite/releases/tag/v5.4.5)
[Compare Source](https://redirect.github.com/vitejs/vite/compare/v5.4.4...v5.4.5)
Please refer to [CHANGELOG.md](https://redirect.github.com/vitejs/vite/blob/v5.4.5/packages/vite/CHANGELOG.md) for details.
### [`v5.4.4`](https://redirect.github.com/vitejs/vite/releases/tag/v5.4.4)
[Compare Source](https://redirect.github.com/vitejs/vite/compare/v5.4.3...v5.4.4)
Please refer to [CHANGELOG.md](https://redirect.github.com/vitejs/vite/blob/v5.4.4/packages/vite/CHANGELOG.md) for details.
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Enabled.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/GoogleCloudPlatform/generative-ai).
Co-authored-by: Kristopher Overholt
---
conversation/chat-app/package-lock.json | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/conversation/chat-app/package-lock.json b/conversation/chat-app/package-lock.json
index eb501b55311..c36f273ab0a 100644
--- a/conversation/chat-app/package-lock.json
+++ b/conversation/chat-app/package-lock.json
@@ -2776,9 +2776,9 @@
"dev": true
},
"node_modules/vite": {
- "version": "5.4.3",
- "resolved": "https://registry.npmjs.org/vite/-/vite-5.4.3.tgz",
- "integrity": "sha512-IH+nl64eq9lJjFqU+/yrRnrHPVTlgy42/+IzbOdaFDVlyLgI/wDlf+FCobXLX1cT0X5+7LMyH1mIy2xJdLfo8Q==",
+ "version": "5.4.6",
+ "resolved": "https://registry.npmjs.org/vite/-/vite-5.4.6.tgz",
+ "integrity": "sha512-IeL5f8OO5nylsgzd9tq4qD2QqI0k2CQLGrWD0rCN0EQJZpBK5vJAx0I+GDkMOXxQX/OfFHMuLIx6ddAxGX/k+Q==",
"dev": true,
"license": "MIT",
"dependencies": {
From 9fb19e71e493415f9ea9352913f7235001d38d82 Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Wed, 18 Sep 2024 10:22:27 -0500
Subject: [PATCH 04/10] ci: Clear "expect.txt" for check-spelling (#1130)
---
.github/actions/spelling/expect.txt | 914 ----------------------------
1 file changed, 914 deletions(-)
diff --git a/.github/actions/spelling/expect.txt b/.github/actions/spelling/expect.txt
index 862dc503f24..e69de29bb2d 100644
--- a/.github/actions/spelling/expect.txt
+++ b/.github/actions/spelling/expect.txt
@@ -1,914 +0,0 @@
-aaae
-aaf
-aaxis
-abcc
-acb
-accura
-acf
-Adedeji
-Adidas
-ADMA
-advanc
-agentic
-AGG
-AGs
-ainsi
-aip
-aiplatform
-akka
-Akkaoui
-Aktu
-alcuna
-allowi
-alloydb
-AlphaFold
-Amarilli
-analysisremote
-Aniston
-anonymization
-antiword
-APAC
-apges
-applehelp
-appuser
-Arborio
-Arsan
-artifactregistry
-Artsakh
-arxiv
-asarray
-ASF
-assurent
-astype
-asyncmock
-Atticus
-automerge
-autopush
-autorater
-autosummary
-autosxs
-autotuning
-autres
-autrui
-backticks
-baco
-Baggins
-Barclays
-barmode
-barpolar
-baxis
-bbc
-bbf
-bboxes
-bcdfd
-beginnen
-beim
-bella
-bellow
-belov'd
-bfa
-Biden
-bigframes
-bigquery
-bigqueryconnection
-bigquerystorage
-bigserial
-Bigtable
-bioenergy
-Bitcoin
-bleiben
-blogposts
-blogs
-bornes
-Borregas
-boulier
-Boyz
-bpd
-bqdf
-bqml
-branchess
-bucketname
-Buckleys
-Buffay
-butta
-Caldara
-CALIPSO
-carbonara
-Carlessian
-catus
-caxis
-ccbf
-cccbd
-cctemplate
-cdad
-cdc
-cdk
-ceb
-cec
-celle
-certifi
-ces
-cet
-Ceux
-chaque
-chatbots
-chatbox
-Chawla
-CHECKOV
-Cheeseman
-Chicxulub
-chipset
-Chocolat
-choosi
-chromadb
-Cinemark
-ckpt
-clearn
-CLIs
-cloudapis
-cloudbuild
-cloudfunction
-cloudkms
-cloudresourcemanager
-cloudrun
-cloudsql
-cloudveil
-cmap
-codebases
-codechat
-codefile
-codelab
-CODEOWNERS
-coderag
-codey
-colab
-Collider
-Colm
-coloraxis
-colorbar
-colorway
-colwidth
-conching
-concourir
-consiste
-consts
-continute
-Contly
-contraint
-cookin
-cosa
-coupable
-coveragerc
-cpet
-crowdsourcing
-crudele
-csa
-cse
-CUAD
-cultura
-currentcolor
-CVD
-CVS
-cygpath
-d'adorarvi
-d'une
-dans
-danza
-darkgrid
-dataform
-dataframe
-datapoints
-dbln
-dcfd
-DCG
-ddc
-ddl
-debian
-deconflict
-defb
-Deleece
-demeurent
-demouser
-dente
-Depatmint
-descgen
-devhelp
-devrel
-DHH
-Dialogflow
-Diarization
-dicesti
-diese
-diesen
-Digala
-directlt
-direnv
-discoveryengine
-disperar
-Disturbia
-Dload
-dlp
-docai
-DOCDB
-docfx
-dockerpush
-docstore
-doctrees
-documentai
-doivent
-Donya
-Dreesen
-driv
-dropna
-DRV
-dtype
-Durafast
-Durmus
-DVDs
-Dzor
-ebbb
-ecommerce
-EDB
-edfc
-Edunov
-effici
-EHR
-EIP
-ekg
-Elimende
-elinks
-elles
-emb
-embeddings
-embvs
-EMEA
-EMFU
-EMNLP
-emplois
-Emul
-emulsif
-encanta
-endblock
-endedness
-endlocal
-eneration
-enterpriseknowledgegraph
-enterprisesearch
-envrc
-epath
-Epc
-Eragon
-erally
-erlang
-erty
-erwinh
-ESG
-Esin
-essen
-etags
-etf
-etils
-euch
-EUR
-evals
-eventarc
-exercer
-expl
-faiss
-fanciulla
-faqs
-fastapi
-fcc
-fda
-fewshot
-FFL
-fiero
-figsize
-fillmode
-fillna
-finall
-firebaserc
-firestore
-Fishburne
-fixmycar
-fixmycarbackend
-flac
-Flipkart
-flowbite
-fmeasure
-FMLA
-foco
-followups
-Folmer
-footwell
-forno
-freind
-froma
-fromarray
-fromiter
-FSD
-fss
-fulltext
-fullurl
-fullwidth
-functiona
-functiondef
-furter
-fuzzywuzzy
-Gatace
-gbq
-gcf
-gcloud
-gcp
-gcs
-GDC
-geht
-genai
-genappbuilder
-Genomics
-GenTwo
-genwealth
-geocoded
-getconn
-getexif
-gidiyor
-Giordani
-Gisting
-gitleaks
-gitpython
-gke
-glusr
-Godzilla
-Gonggong
-Googl
-googleapiclient
-googlecloud
-gpg
-gpt
-gpu
-gradio
-gradlew
-gramm
-grammer
-gridcolor
-grocerybot
-grpcio
-gserviceaccount
-gsm
-gspread
-gsutil
-guanciale
-gunicorn
-hadolint
-Hamers
-hashicorp
-hashtag
-Haumea
-hdlr
-heatmap
-heatmapgl
-HEPASKY
-hexsha
-Hickson
-hnsw
-hoffe
-Hogwarts
-Holog
-holtskinner
-HOMEDRIVE
-HOMEPATH
-hommes
-hovermode
-htmlhelp
-htmlhintrc
-Hubmann
-HZN
-iban
-idk
-IFI
-ifidx
-iloc
-ils
-imagegeneration
-imageno
-imagesearch
-imagetext
-imdb
-immagine
-imshow
-Inagella
-inbox
-indexvalue
-individu
-ingre
-ingredie
-inputtext
-instru
-instruc
-Inte
-intersphinx
-invo
-Iosif
-ipykernel
-ipynb
-ipywidgets
-IRAs
-isq
-italiana
-ITDMs
-iterrows
-ivf
-ivfflat
-ixed
-J'aime
-javac
-JAVACMD
-Jax
-JBEAP
-jdk
-Jedi
-jegadesh
-JHome
-jiwer
-jpa
-jre
-jsonify
-jumpstart
-junitxml
-jupyter
-jusqu
-kaggle
-Kalamang
-Kamradt
-kann
-Keanu
-Keown
-keras
-Keyb
-Keychain
-KFBI
-kgs
-KHTML
-kickstart
-Knative
-KPIs
-KSA
-Kudrow
-l'anglais
-l'exercice
-Ladhak
-lakecolor
-Lalit
-landcolor
-langchain
-Lasst
-lastrequest
-lastresponse
-LCEL
-lego
-Legrenzi
-lengh
-leur
-Leute
-lexer
-ligh
-linalg
-linecolor
-linestyle
-linting
-Liquicap
-listdir
-llm
-logits
-Logrus
-loguru
-loi
-lolcat
-lon
-LOOKBACK
-Lottry
-LPH
-lsb
-LSum
-lxml
-Mager
-magika
-mai
-Maillard
-Makemake
-mapbox
-marb
-maskmode
-matchingengine
-mavenrc
-mbsdk
-mdc
-mec
-mediterraneansea
-membres
-meme
-Memorystore
-Mercari
-metadatas
-metageneration
-MFU
-MICOA
-millisecs
-Mindf
-miniforge
-Minuetto
-mio
-mmr
-Molaison
-morire
-moto
-Mpa
-mpe
-mpld
-mrag
-mtu
-multimodalembedding
-multitool
-mvn
-mvnw
-myaccount
-mydb
-mydomain
-myprojectid
-myvertexdatastoreid
-n'ordonne
-naissent
-Nakoso
-nanswer
-Narnia
-nas
-naturels
-nazione
-nbconvert
-nbformat
-nbqa
-nce
-ncols
-ndarray
-NDCG
-neces
-netif
-networkmanagement
-Neue
-nio
-nlp
-nltk
-nobserved
-nodularis
-Nominatim
-norigin
-noticeab
-noxfile
-nrows
-ntheory
-nuisibles
-nuit
-nvidia
-Nyquil
-ocr
-ODb
-OLAP
-oliv
-olleh
-openai
-openfda
-Orbelians
-ordre
-orgpolicy
-ori
-originalname
-oslogin
-osm
-OTCH
-owlbot
-pagemap
-Pakeman
-paleo
-pancetta
-Paolini
-Paquete
-parcoords
-Pati
-payslip
-paystub
-PDEs
-pdfminer
-pdfplumber
-pdfs
-peines
-personne
-petabytes
-peut
-peuvent
-pgadmin
-PGDATABASE
-PGHOST
-PGPORT
-PGUSER
-pgvector
-photorealistic
-Pichai
-pii
-pincodes
-pixmap
-pkl
-plac
-playlists
-plc
-plotly
-PLOTLYENV
-plpgsql
-pls
-plt
-poissons
-posargs
-posso
-postgres
-postgresql
-pourvu
-pouvoir
-prcntg
-preds
-prepari
-prerel
-pretrained
-prewritten
-proactively
-Procfile
-programar
-PROJECTBASEDIR
-projectid
-proname
-protobuf
-psa
-pstotext
-psychographics
-Pullum
-puni
-punisse
-pyasn
-Pydanitc
-pydantic
-pydub
-pymupdf
-pyopenssl
-pypdf
-pyplot
-pytesseract
-PYTHONUNBUFFERED
-pytorch
-pyupgrade
-qna
-QPM
-qthelp
-qu'elle
-Quaoar
-qubit
-questa
-Qwiklab
-ragdemos
-raggio
-rarian
-ratelimit
-receieve
-recommonmark
-regexes
-Reimagining
-rekenrek
-rembg
-remoting
-REPOURL
-reprompt
-requestz
-reranking
-resourced
-resourcemanager
-resul
-Reza
-ribeye
-ricc
-riccardo
-RLHF
-Roboto
-Ruchika
-runjdwp
-RYDE
-Sahel
-saisi
-Sauron
-Sca
-scattercarpet
-scattergeo
-scattergl
-scattermapbox
-scatterpolar
-scatterpolargl
-scatterternary
-Schlenoff
-Schwimmer
-sco
-screencast
-screenshots
-seaborn
-seatback
-Sebben
-seby
-secretmanager
-Sedna
-SEK
-Selam
-selfie
-selon
-sentenc
-seo
-seperate
-serait
-serializinghtml
-serviceaccount
-servicedirectory
-servicenetworking
-serviceusage
-setlocal
-sft
-Shklovsky
-shortdesc
-showlakes
-showland
-showor
-showtime
-Shubham
-siglap
-simage
-Siri
-sittin
-Skaffold
-sklearn
-sku
-slf
-smartphone
-Smaug
-SNB
-SNE
-snowfakery
-sociales
-soit
-solutionbuilder
-sono
-sont
-Speci
-sphinxcontrib
-spirito
-Sprachen
-sprechen
-springframework
-sqlalchemy
-sqlfluff
-ssd
-ssml
-ssn
-stackoverflow
-stakeholders
-stcore
-stext
-Stic
-STIX
-stp
-streamlit
-stru
-stt
-stylelintrc
-Subworkflow
-summ
-Sundar
-Superstore
-synthtool
-systemtest
-sytem
-Syunik
-TABLESPACE
-tagline
-tailwindcss
-Tast
-templatefile
-temurin
-tensorboard
-tensorflow
-terraform
-testutils
-textembedding
-texting
-textno
-texttospeech
-tfhub
-tftpl
-thelook
-THREDED
-thres
-thsoe
-tiangolo
-tiendrons
-tiktoken
-timechart
-timecode
-TLDR
-tobytes
-Tolkien
-tomat
-Tomoko
-topk
-toself
-tous
-toute
-tpu
-tqdm
-tran
-Tribbiani
-trustedtester
-tsne
-tsv
-tts
-typehints
-UBS
-UDFs
-UEFA
-und
-Undeploying
-undst
-unigram
-unrtf
-Unsplash
-uomo
-Urs
-usebackq
-usecases
-Utik
-uvicorn
-vais
-Vayots
-VDF
-vectoral
-vectorsearch
-vectorstore
-vedi
-Vergin
-Verilog
-vertexai
-vertexdatastoreid
-viai
-viewcode
-Vodafone
-vous
-vpcaccess
-VQA
-VSC
-vtpm
-vulnz
-wdir
-webclient
-webinar
-webpage
-websites
-Wehn
-weil
-welcom
-Wellcare
-werkzeug
-wikilingua
-wikipediaapi
-wil
-Willibald
-wip
-WORKDIR
-wth
-xaxes
-xaxis
-xdg
-Xferd
-xlabel
-xsi
-Xsrf
-xsum
-xticks
-xxxxxxx
-xxxxxxxx
-xxxxxxxxxx
-yaxes
-yaxis
-yeux
-ylabel
-yourselfers
-youtube
-ytd
-yticks
-zakarid
-zaxis
-zdq
-Zom
-Zootopia
-Zscaler
-Zuercher
From 5ee7d80028a6a13b75efdafec25838f1fdb783d3 Mon Sep 17 00:00:00 2001
From: Kartik Chaudhary
Date: Wed, 18 Sep 2024 21:40:28 +0530
Subject: [PATCH 05/10] feat: add moviepy, fontdict to allow.txt (#1131)
---
.github/actions/spelling/allow.txt | 2 ++
1 file changed, 2 insertions(+)
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 92842866971..6dc93eb8c97 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -343,6 +343,7 @@ ffi
figsize
fillmode
flac
+fontdict
forno
freedraw
freopen
@@ -441,6 +442,7 @@ metadatas
mgrs
miranda
morty
+moviepy
mpn
mrr
nbconvert
From cbc49cc90875655dcd26647835edfcb41c6b2caf Mon Sep 17 00:00:00 2001
From: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Date: Wed, 18 Sep 2024 11:18:32 -0500
Subject: [PATCH 06/10] ci: Add Autoflake to `nox -s format` (#1100)
# Description
Ran autoflake and pyupgrade across the repo, removing unused imports and modernizing type annotations (e.g. `Optional[str]` -> `str | None`, `List` -> `list`).
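The flags wired into `nox -s format` (see the `noxfile.py` hunk below) amount to roughly this standalone sketch, with `.` standing in for the repo's `LINT_PATHS` and pyupgrade applied per file:
```sh
# Drop unused imports in place, recursively:
autoflake -i -r --remove-all-unused-imports .
# Apply ruff's safe autofixes without reporting remaining diagnostics:
ruff check --fix-only .
# Modernize annotations (Optional[X] -> X | None); file name is a placeholder:
pyupgrade --py310-plus some_module.py
```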
---------
Co-authored-by: Owl Bot
---
.github/actions/spelling/allow.txt | 19 ++++++
.github/actions/spelling/excludes.txt | 2 +
.../workflows/issue_assigner/assign_issue.py | 2 +-
.github/workflows/update_notebook_links.py | 5 +-
gemini/function-calling/sql-talk-app/app.py | 4 +-
.../app/pages_utils/downloads.py | 2 +-
.../app/pages_utils/edit_image.py | 2 +-
.../app/pages_utils/imagen.py | 3 +-
.../app/pages_utils/insights.py | 3 +-
.../cloud_functions/gemini_call/main.py | 6 +-
.../cloud_functions/imagen_call/main.py | 4 +-
.../cloud_functions/text_embedding/main.py | 4 +-
.../pages/3_Graph_Visualization.py | 2 +-
.../fixmycar/frontend/streamlit-backend.py | 2 +-
.../gemini-streamlit-cloudrun/app.py | 5 +-
.../function-scripts/process-pdf/main.py | 17 +++--
.../update-search-index/main.py | 7 +--
...0\237\227\204\357\270\217 Data Sources.py" | 2 +-
.../photo-discovery/ag-web/app/app.py | 7 +--
.../utils/intro_multimodal_rag_utils.py | 63 ++++++++++---------
.../document-qa/utils/matching_engine.py | 59 ++++++++---------
.../utils/matching_engine_utils.py | 5 +-
noxfile.py | 49 +++++++++------
owlbot.py | 2 +-
search/cloud-function/python/main.py | 8 +--
.../test_integration_vertex_search_client.py | 2 +-
.../python/vertex_ai_search_client.py | 28 ++++-----
search/web-app/ekg_utils.py | 12 ++--
search/web-app/genappbuilder_utils.py | 33 +++++-----
29 files changed, 190 insertions(+), 169 deletions(-)
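For contributors, the combined pass can be reproduced locally through the updated nox sessions (a usage sketch; session names are taken from the noxfile.py changes below):
```sh
nox -s format             # autoflake + ruff --fix-only, then isort and black
nox -s format_notebooks   # roughly the same pass applied to notebooks via nbqa
```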
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 6dc93eb8c97..f9346345d34 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -107,6 +107,7 @@ Jang
Jedi
Joji
KNN
+KPIs
Kaelen
Kaggle
Kamradt
@@ -248,6 +249,7 @@ Womens
XXE
Yuzuru
Zijin
+Zom
Zscaler
Zuercher
aadd
@@ -261,6 +263,7 @@ afrom
agentic
ainit
ainvoke
+aip
airlume
alloydb
antiword
@@ -271,6 +274,7 @@ arXiv
aretrieve
arun
astype
+autoflake
autogen
automl
autoptr
@@ -306,6 +310,7 @@ colwidth
constexpr
corpuses
csa
+cse
cupertino
dask
dataframe
@@ -320,6 +325,7 @@ deskmates
dino
diy
docai
+docstore
dpi
draig
drinkware
@@ -328,7 +334,9 @@ dsl
dtypes
dwmapi
ecommerce
+ekg
elous
+emb
embs
emojis
ename
@@ -337,17 +345,20 @@ etf
eur
evals
faiss
+fastapi
fect
fewshot
ffi
figsize
fillmode
+firestore
flac
fontdict
forno
freedraw
freopen
fromarray
+fromiter
fts
fulltext
funtion
@@ -395,6 +406,7 @@ idk
idks
idxs
iloc
+imageno
imdb
imshow
iostream
@@ -406,6 +418,7 @@ itables
iterrows
jegadesh
jetbrains
+jsonify
jupyter
kaggle
kenleejr
@@ -466,6 +479,7 @@ onesies
osx
owlbot
oxml
+pagemap
paleo
pancetta
pantarba
@@ -497,6 +511,7 @@ projectid
protobuf
pstotext
pubspec
+putdata
pvc
pyautogen
pybind
@@ -552,6 +567,7 @@ sxs
tagline
tencel
termcolor
+textno
tfhub
tfidf
tgz
@@ -559,14 +575,17 @@ thelook
tiktoken
timechart
titlebar
+tobytes
toself
tqdm
tritan
ubuntu
+undst
unigram
unrtf
upsell
urandom
+usecases
username
usernames
uvb
diff --git a/.github/actions/spelling/excludes.txt b/.github/actions/spelling/excludes.txt
index 02d685a97ac..b551f438713 100644
--- a/.github/actions/spelling/excludes.txt
+++ b/.github/actions/spelling/excludes.txt
@@ -107,3 +107,5 @@ ignore$
^\Qowlbot.py\E$
^\Qsearch/bulk-question-answering/bulk_question_answering_output.tsv\E$
^\Q.github/workflows/issue_assigner/assign_issue.py\E$
+^\Qnoxfile.py\E$
+^\Qowlbot.py\E$
diff --git a/.github/workflows/issue_assigner/assign_issue.py b/.github/workflows/issue_assigner/assign_issue.py
index e489f31e4d8..3360d20be7b 100644
--- a/.github/workflows/issue_assigner/assign_issue.py
+++ b/.github/workflows/issue_assigner/assign_issue.py
@@ -12,7 +12,7 @@
def get_issue_number(event_path: str) -> int:
"""Retrieves the issue number from GitHub event data."""
# Load event data
- with open(event_path, "r", encoding="utf-8") as f:
+ with open(event_path, encoding="utf-8") as f:
event_data = json.load(f)
# Determine the issue number based on the event
diff --git a/.github/workflows/update_notebook_links.py b/.github/workflows/update_notebook_links.py
index b8f75e2357d..c9d7852f976 100644
--- a/.github/workflows/update_notebook_links.py
+++ b/.github/workflows/update_notebook_links.py
@@ -2,7 +2,6 @@
import os
import sys
-from typing import Tuple
import urllib.parse
import nbformat
@@ -21,7 +20,7 @@
def fix_markdown_links(
cell_source: str, relative_notebook_path: str
-) -> Tuple[str, bool]:
+) -> tuple[str, bool]:
"""Fixes links in a markdown cell and returns the updated source."""
new_lines = []
changes_made = False
@@ -58,7 +57,7 @@ def fix_markdown_links(
def fix_links_in_notebook(notebook_path: str) -> int:
"""Fixes specific types of links in a Jupyter notebook."""
- with open(notebook_path, "r", encoding="utf-8") as f:
+ with open(notebook_path, encoding="utf-8") as f:
notebook = nbformat.read(f, as_version=4)
relative_notebook_path = os.path.relpath(notebook_path, start=os.getcwd()).lower()
diff --git a/gemini/function-calling/sql-talk-app/app.py b/gemini/function-calling/sql-talk-app/app.py
index b1be8651967..c7ecda501b0 100644
--- a/gemini/function-calling/sql-talk-app/app.py
+++ b/gemini/function-calling/sql-talk-app/app.py
@@ -115,7 +115,7 @@
for message in st.session_state.messages:
with st.chat_message(message["role"]):
- st.markdown(message["content"].replace("$", "\$")) # noqa: W605
+ st.markdown(message["content"].replace("$", r"\$")) # noqa: W605
try:
with st.expander("Function calls, parameters, and responses"):
st.markdown(message["backend_details"])
@@ -257,7 +257,7 @@
full_response = response.text
with message_placeholder.container():
- st.markdown(full_response.replace("$", "\$")) # noqa: W605
+ st.markdown(full_response.replace("$", r"\$")) # noqa: W605
with st.expander("Function calls, parameters, and responses:"):
st.markdown(backend_details)
diff --git a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/downloads.py b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/downloads.py
index 3eb03b88b41..66f62c6ffa4 100644
--- a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/downloads.py
+++ b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/downloads.py
@@ -90,7 +90,7 @@ def download_button(object_to_download: bytes, download_filename: str) -> str:
b64 = base64.b64encode(zip_content).decode()
# Read the HTML template file
- with open("app/download_link.html", "r", encoding="utf8") as f:
+ with open("app/download_link.html", encoding="utf8") as f:
html_template = f.read()
# Replace placeholders in the HTML template
diff --git a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/edit_image.py b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/edit_image.py
index 86a2ceacfa7..e37a5640ff5 100644
--- a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/edit_image.py
+++ b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/edit_image.py
@@ -121,7 +121,7 @@ def handle_image_upload() -> None:
filename = "uploaded_image0.png"
image.save(filename)
st.session_state.start_editing = True
- except (IOError, PIL.UnidentifiedImageError) as e:
+ except (OSError, PIL.UnidentifiedImageError) as e:
st.error(f"Error opening image: {e}")
diff --git a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/imagen.py b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/imagen.py
index 61e40db36a4..e5d9de6181c 100644
--- a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/imagen.py
+++ b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/imagen.py
@@ -10,7 +10,6 @@
import json
import logging
import os
-from typing import Optional
from PIL import Image
import aiohttp as cloud_function_call
@@ -87,7 +86,7 @@ def image_generation(
images[0].save(location=f"{filename}.png", include_generation_parameters=False)
-async def parallel_image_generation(prompt: str, col: int) -> Optional[Image.Image]:
+async def parallel_image_generation(prompt: str, col: int) -> Image.Image | None:
"""
Executes parallel generation of images through Imagen.
diff --git a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/insights.py b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/insights.py
index 9419d1a6b9e..e94af3c98e9 100644
--- a/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/insights.py
+++ b/gemini/sample-apps/accelerating_product_innovation/app/pages_utils/insights.py
@@ -11,7 +11,6 @@
import json
import os
import re
-from typing import Optional
from app.pages_utils.embedding_model import embedding_model_with_backoff
from app.pages_utils.get_llm_response import generate_gemini
@@ -77,7 +76,7 @@ def get_suggestions(state_key: str) -> None:
st.session_state[state_key] = extract_bullet_points(gen_suggestions)
-def get_stored_embeddings_as_df() -> Optional[pd.DataFrame]:
+def get_stored_embeddings_as_df() -> pd.DataFrame | None:
"""Retrieves and processes stored embeddings from cloud storage.
Returns:
diff --git a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/gemini_call/main.py b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/gemini_call/main.py
index 901c3edd092..b1224677278 100644
--- a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/gemini_call/main.py
+++ b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/gemini_call/main.py
@@ -3,7 +3,7 @@
"""
import os
-from typing import Any, Dict, Tuple, Union
+from typing import Any
from dotenv import load_dotenv
import functions_framework
@@ -41,7 +41,7 @@ def generate_text(prompt: str) -> str:
@functions_framework.http
-def get_llm_response(request: Any) -> Union[Dict, Tuple]:
+def get_llm_response(request: Any) -> dict | tuple:
"""HTTP Cloud Function that generates text using the Gemini-Pro model.
Args:
@@ -53,7 +53,7 @@ def get_llm_response(request: Any) -> Union[Dict, Tuple]:
Response object using `make_response`
.
"""
- request_json: Dict = request.get_json(silent=True)
+ request_json: dict = request.get_json(silent=True)
if not request_json or "text_prompt" not in request_json:
return {"error": "Request body must contain 'text_prompt' field."}, 400
diff --git a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/imagen_call/main.py b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/imagen_call/main.py
index decf23478b5..a04df1badce 100644
--- a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/imagen_call/main.py
+++ b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/imagen_call/main.py
@@ -3,7 +3,7 @@
"""
import os
-from typing import Any, Dict
+from typing import Any
from dotenv import load_dotenv
import functions_framework
@@ -46,5 +46,5 @@ def get_images(request: Any) -> bytes:
Returns:
Response: A Flask Response object containing the generated image.
"""
- request_json: Dict = request.get_json(silent=True)
+ request_json: dict = request.get_json(silent=True)
return image_generation(request_json["img_prompt"])
diff --git a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/text_embedding/main.py b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/text_embedding/main.py
index f46216b4dff..2c9de8ee79b 100644
--- a/gemini/sample-apps/accelerating_product_innovation/cloud_functions/text_embedding/main.py
+++ b/gemini/sample-apps/accelerating_product_innovation/cloud_functions/text_embedding/main.py
@@ -4,7 +4,7 @@
import json
import os
-from typing import Any, List
+from typing import Any
from dotenv import load_dotenv
import functions_framework
@@ -20,7 +20,7 @@
embedding_model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")
-def get_embeddings(instances: list[str]) -> List[List[float]]:
+def get_embeddings(instances: list[str]) -> list[list[float]]:
"""
Generates embeddings for given text.
diff --git a/gemini/sample-apps/finance-advisor-spanner/pages/3_Graph_Visualization.py b/gemini/sample-apps/finance-advisor-spanner/pages/3_Graph_Visualization.py
index a9b48f724b3..7e28ea4f50d 100644
--- a/gemini/sample-apps/finance-advisor-spanner/pages/3_Graph_Visualization.py
+++ b/gemini/sample-apps/finance-advisor-spanner/pages/3_Graph_Visualization.py
@@ -13,7 +13,7 @@
)
graph_viz.generate_graph()
-with open("graph_viz.html", "r", encoding="utf-8") as html_file:
+with open("graph_viz.html", encoding="utf-8") as html_file:
source_code = html_file.read()
components.html(source_code, height=950, width=900)
diff --git a/gemini/sample-apps/fixmycar/frontend/streamlit-backend.py b/gemini/sample-apps/fixmycar/frontend/streamlit-backend.py
index c7220f2c15d..810164eb562 100644
--- a/gemini/sample-apps/fixmycar/frontend/streamlit-backend.py
+++ b/gemini/sample-apps/fixmycar/frontend/streamlit-backend.py
@@ -12,7 +12,7 @@ def get_chat_response(user_prompt: str, messages: []) -> str:
request = {"prompt": user_prompt}
response = requests.post(backend_url + "/chat", json=request)
if response.status_code != 200:
- raise Exception("Bad response from backend: {}".format(response.text))
+ raise Exception(f"Bad response from backend: {response.text}")
return response.json()["response"]
diff --git a/gemini/sample-apps/gemini-streamlit-cloudrun/app.py b/gemini/sample-apps/gemini-streamlit-cloudrun/app.py
index 39925ef5af5..8c20fe6ad48 100644
--- a/gemini/sample-apps/gemini-streamlit-cloudrun/app.py
+++ b/gemini/sample-apps/gemini-streamlit-cloudrun/app.py
@@ -4,7 +4,6 @@
"""
import os
-from typing import List, Tuple, Union
import streamlit as st
import vertexai
@@ -23,14 +22,14 @@
@st.cache_resource
-def load_models() -> Tuple[GenerativeModel, GenerativeModel]:
+def load_models() -> tuple[GenerativeModel, GenerativeModel]:
"""Load Gemini 1.5 Flash and Pro models."""
return GenerativeModel("gemini-1.5-flash"), GenerativeModel("gemini-1.5-pro")
def get_gemini_response(
model: GenerativeModel,
- contents: Union[str, List],
+ contents: str | list,
generation_config: GenerationConfig = GenerationConfig(
temperature=0.1, max_output_tokens=2048
),
diff --git a/gemini/sample-apps/genwealth/function-scripts/process-pdf/main.py b/gemini/sample-apps/genwealth/function-scripts/process-pdf/main.py
index 2ce653e792e..3bdefbc6f3b 100644
--- a/gemini/sample-apps/genwealth/function-scripts/process-pdf/main.py
+++ b/gemini/sample-apps/genwealth/function-scripts/process-pdf/main.py
@@ -3,7 +3,6 @@
import os
from pathlib import Path
import re
-from typing import List, Optional
import uuid
import functions_framework
@@ -23,13 +22,13 @@ def batch_process_documents(
location: str,
processor_id: str,
gcs_output_uri: str,
- processor_version_id: Optional[str] = None,
- gcs_input_uri: Optional[str] = None,
- input_mime_type: Optional[str] = None,
- gcs_input_prefix: Optional[str] = None,
- field_mask: Optional[str] = None,
+ processor_version_id: str | None = None,
+ gcs_input_uri: str | None = None,
+ input_mime_type: str | None = None,
+ gcs_input_prefix: str | None = None,
+ field_mask: str | None = None,
timeout: int = 400,
-) -> List[storage.Blob]:
+) -> list[storage.Blob]:
"""Function to batch process documents"""
# You must set the `api_endpoint` if you use a location other than "us".
opts = ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")
@@ -298,8 +297,6 @@ def process_pdf(cloud_event):
ticker = Path(source_file).stem
publisher = pubsub_v1.PublisherClient()
topic_name = f"projects/{project_id}/topics/{project_id}-doc-ready"
- future = publisher.publish(
- topic_name, bytes(f"{ticker}".encode("utf-8")), spam="done"
- )
+ future = publisher.publish(topic_name, bytes(f"{ticker}".encode()), spam="done")
future.result()
print("Sent message to pubsub")
diff --git a/gemini/sample-apps/genwealth/function-scripts/update-search-index/main.py b/gemini/sample-apps/genwealth/function-scripts/update-search-index/main.py
index a041f517776..54492ce1f09 100644
--- a/gemini/sample-apps/genwealth/function-scripts/update-search-index/main.py
+++ b/gemini/sample-apps/genwealth/function-scripts/update-search-index/main.py
@@ -1,7 +1,6 @@
"""Function to update the Vertex AI Search and Conversion index"""
import os
-from typing import Optional
import functions_framework
from google.api_core.client_options import ClientOptions
@@ -12,9 +11,9 @@ def import_documents_sample(
project_id: str,
location: str,
data_store_id: str,
- gcs_uri: Optional[str] = None,
- bigquery_dataset: Optional[str] = None,
- bigquery_table: Optional[str] = None,
+ gcs_uri: str | None = None,
+ bigquery_dataset: str | None = None,
+ bigquery_table: str | None = None,
) -> str:
"""Function to import documents"""
# For more information, refer to:
diff --git "a/gemini/sample-apps/llamaindex-rag/ui/pages/1_\360\237\227\204\357\270\217 Data Sources.py" "b/gemini/sample-apps/llamaindex-rag/ui/pages/1_\360\237\227\204\357\270\217 Data Sources.py"
index 9808a587f67..1a05fe87e30 100644
--- "a/gemini/sample-apps/llamaindex-rag/ui/pages/1_\360\237\227\204\357\270\217 Data Sources.py"
+++ "b/gemini/sample-apps/llamaindex-rag/ui/pages/1_\360\237\227\204\357\270\217 Data Sources.py"
@@ -77,7 +77,7 @@ def update_index(
},
)
if response.status_code == 200:
- st.success(f"Updated data source(s) successfully!")
+ st.success("Updated data source(s) successfully!")
else:
st.error("Error updating index.")
diff --git a/gemini/sample-apps/photo-discovery/ag-web/app/app.py b/gemini/sample-apps/photo-discovery/ag-web/app/app.py
index 4336f4a64fc..9cad5e75154 100644
--- a/gemini/sample-apps/photo-discovery/ag-web/app/app.py
+++ b/gemini/sample-apps/photo-discovery/ag-web/app/app.py
@@ -14,9 +14,6 @@
import json
import os
-import re
-
-import requests
#
# Reasoning Engine
@@ -45,14 +42,12 @@
SEARCH_ENGINE_ID = ""
-search_client_options = ClientOptions(api_endpoint=f"us-discoveryengine.googleapis.com")
+search_client_options = ClientOptions(api_endpoint="us-discoveryengine.googleapis.com")
search_client = discoveryengine.SearchServiceClient(
client_options=search_client_options
)
search_serving_config = f"projects/{PROJECT_ID}/locations/us/collections/default_collection/dataStores/{SEARCH_ENGINE_ID}/servingConfigs/default_search:search"
-import json
-
def search_gms(search_query, rows):
# build a search request
diff --git a/gemini/use-cases/retrieval-augmented-generation/utils/intro_multimodal_rag_utils.py b/gemini/use-cases/retrieval-augmented-generation/utils/intro_multimodal_rag_utils.py
index 926a2b76afe..42d99dff211 100644
--- a/gemini/use-cases/retrieval-augmented-generation/utils/intro_multimodal_rag_utils.py
+++ b/gemini/use-cases/retrieval-augmented-generation/utils/intro_multimodal_rag_utils.py
@@ -1,7 +1,8 @@
+from collections.abc import Iterable
import glob
import os
import time
-from typing import Any, Dict, Iterable, List, Optional, Tuple, Union
+from typing import Any
from IPython.display import display
import PIL
@@ -30,7 +31,7 @@
def get_text_embedding_from_text_embedding_model(
text: str,
- return_array: Optional[bool] = False,
+ return_array: bool | None = False,
) -> list:
"""
Generates a numerical text embedding from a provided text input using a text embedding model.
@@ -58,8 +59,8 @@ def get_text_embedding_from_text_embedding_model(
def get_image_embedding_from_multimodal_embedding_model(
image_uri: str,
embedding_size: int = 512,
- text: Optional[str] = None,
- return_array: Optional[bool] = False,
+ text: str | None = None,
+ return_array: bool | None = False,
) -> list:
"""Extracts an image embedding from a multimodal embedding model.
The function can optionally utilize contextual text to refine the embedding.
@@ -129,7 +130,7 @@ def get_text_overlapping_chunk(
return chunked_text_dict
-def get_page_text_embedding(text_data: Union[dict, str]) -> dict:
+def get_page_text_embedding(text_data: dict | str) -> dict:
"""
* Generates embeddings for each text chunk using a specified embedding model.
* Takes a dictionary of text chunks and an embedding size as input.
@@ -219,7 +220,7 @@ def get_image_for_gemini(
image_save_dir: str,
file_name: str,
page_num: int,
-) -> Tuple[Image, str]:
+) -> tuple[Image, str]:
"""
Extracts an image from a PDF document, converts it to JPEG format, saves it to a specified directory,
and loads it as a PIL Image Object.
@@ -260,12 +261,12 @@ def get_image_for_gemini(
def get_gemini_response(
generative_multimodal_model,
- model_input: List[str],
+ model_input: list[str],
stream: bool = True,
- generation_config: Optional[GenerationConfig] = GenerationConfig(
- temperature=0.2, max_output_tokens=2048
- ),
- safety_settings: Optional[dict] = {
+ generation_config: GenerationConfig
+ | None = GenerationConfig(temperature=0.2, max_output_tokens=2048),
+ safety_settings: dict
+ | None = {
HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
@@ -306,7 +307,7 @@ def get_gemini_response(
def get_text_metadata_df(
- filename: str, text_metadata: Dict[Union[int, str], Dict]
+ filename: str, text_metadata: dict[int | str, dict]
) -> pd.DataFrame:
"""
This function takes a filename and a text metadata dictionary as input,
@@ -322,11 +323,11 @@ def get_text_metadata_df(
A Pandas DataFrame with the extracted text, chunk text, and chunk embeddings for each page.
"""
- final_data_text: List[Dict] = []
+ final_data_text: list[dict] = []
for key, values in text_metadata.items():
for chunk_number, chunk_text in values["chunked_text_dict"].items():
- data: Dict = {}
+ data: dict = {}
data["file_name"] = filename
data["page_num"] = int(key) + 1
data["text"] = values["text"]
@@ -345,7 +346,7 @@ def get_text_metadata_df(
def get_image_metadata_df(
- filename: str, image_metadata: Dict[Union[int, str], Dict]
+ filename: str, image_metadata: dict[int | str, dict]
) -> pd.DataFrame:
"""
This function takes a filename and an image metadata dictionary as input,
@@ -361,10 +362,10 @@ def get_image_metadata_df(
A Pandas DataFrame with the extracted image path, image description, and image embeddings for each image.
"""
- final_data_image: List[Dict] = []
+ final_data_image: list[dict] = []
for key, values in image_metadata.items():
for _, image_values in values.items():
- data: Dict = {}
+ data: dict = {}
data["file_name"] = filename
data["page_num"] = int(key) + 1
data["img_num"] = int(image_values["img_num"])
@@ -392,10 +393,10 @@ def get_document_metadata(
image_save_dir: str,
image_description_prompt: str,
embedding_size: int = 128,
- generation_config: Optional[GenerationConfig] = GenerationConfig(
- temperature=0.2, max_output_tokens=2048
- ),
- safety_settings: Optional[dict] = {
+ generation_config: GenerationConfig
+ | None = GenerationConfig(temperature=0.2, max_output_tokens=2048),
+ safety_settings: dict
+ | None = {
HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
@@ -403,7 +404,7 @@ def get_document_metadata(
},
add_sleep_after_page: bool = False,
sleep_time_after_page: int = 2,
-) -> Tuple[pd.DataFrame, pd.DataFrame]:
+) -> tuple[pd.DataFrame, pd.DataFrame]:
"""
This function takes a PDF path, an image save directory, an image description prompt, an embedding size, and a text embedding text limit as input.
@@ -435,8 +436,8 @@ def get_document_metadata(
file_name = pdf_path.split("/")[-1]
- text_metadata: Dict[Union[int, str], Dict] = {}
- image_metadata: Dict[Union[int, str], Dict] = {}
+ text_metadata: dict[int | str, dict] = {}
+ image_metadata: dict[int | str, dict] = {}
for page_num, page in enumerate(doc):
print(f"Processing page: {page_num + 1}")
@@ -582,7 +583,7 @@ def get_cosine_score(
def print_text_to_image_citation(
- final_images: Dict[int, Dict[str, Any]], print_top: bool = True
+ final_images: dict[int, dict[str, Any]], print_top: bool = True
) -> None:
"""
Prints a formatted citation for each matched image in a dictionary.
@@ -633,7 +634,7 @@ def print_text_to_image_citation(
def print_text_to_text_citation(
- final_text: Dict[int, Dict[str, Any]],
+ final_text: dict[int, dict[str, Any]],
print_top: bool = True,
chunk_text: bool = True,
) -> None:
@@ -694,7 +695,7 @@ def get_similar_image_from_query(
image_emb: bool = True,
top_n: int = 3,
embedding_size: int = 128,
-) -> Dict[int, Dict[str, Any]]:
+) -> dict[int, dict[str, Any]]:
"""
Finds the top N most similar images from a metadata DataFrame based on a text query or an image query.
@@ -737,7 +738,7 @@ def get_similar_image_from_query(
top_n_cosine_values = cosine_scores.nlargest(top_n).values.tolist()
# Create a dictionary to store matched images and their information
- final_images: Dict[int, Dict[str, Any]] = {}
+ final_images: dict[int, dict[str, Any]] = {}
for matched_imageno, indexvalue in enumerate(top_n_cosine_scores):
# Create a sub-dictionary for each matched image
@@ -798,7 +799,7 @@ def get_similar_text_from_query(
top_n: int = 3,
chunk_text: bool = True,
print_citation: bool = False,
-) -> Dict[int, Dict[str, Any]]:
+) -> dict[int, dict[str, Any]]:
"""
Finds the top N most similar text passages from a metadata DataFrame based on a text query.
@@ -838,7 +839,7 @@ def get_similar_text_from_query(
top_n_scores = cosine_scores.nlargest(top_n).values.tolist()
# Create a dictionary to store matched text and their information
- final_text: Dict[int, Dict[str, Any]] = {}
+ final_text: dict[int, dict[str, Any]] = {}
for matched_textno, index in enumerate(top_n_indices):
# Create a sub-dictionary for each matched text
@@ -879,7 +880,7 @@ def get_similar_text_from_query(
def display_images(
- images: Iterable[Union[str, PIL.Image.Image]], resize_ratio: float = 0.5
+ images: Iterable[str | PIL.Image.Image], resize_ratio: float = 0.5
) -> None:
"""
Displays a series of images provided as paths or PIL Image objects.
diff --git a/language/use-cases/document-qa/utils/matching_engine.py b/language/use-cases/document-qa/utils/matching_engine.py
index cc32826b35c..00ff34838ee 100644
--- a/language/use-cases/document-qa/utils/matching_engine.py
+++ b/language/use-cases/document-qa/utils/matching_engine.py
@@ -2,9 +2,10 @@
from __future__ import annotations
+from collections.abc import Iterable
import json
import logging
-from typing import Any, Iterable, List, Optional, Type
+from typing import Any
import uuid
import google.auth
@@ -47,7 +48,7 @@ def __init__(
index_client: aiplatform_v1.IndexServiceClient,
index_endpoint_client: aiplatform_v1.IndexEndpointServiceClient,
gcs_bucket_name: str,
- credentials: Credentials = None,
+ credentials: Credentials | None = None,
):
"""Vertex AI Matching Engine implementation of the vector store.
@@ -106,9 +107,9 @@ def _validate_google_libraries_installation(self) -> None:
def add_texts(
self,
texts: Iterable[str],
- metadatas: Optional[Iterable[dict]],
+ metadatas: Iterable[dict] | None,
**kwargs: Any,
- ) -> List[str]:
+ ) -> list[str]:
"""Run more texts through the embeddings and add to the vectorstore.
Args:
@@ -169,7 +170,7 @@ def _upload_to_gcs(self, data: str, gcs_location: str) -> None:
def get_matches(
self,
- embeddings: List[str],
+ embeddings: list[str],
n_matches: int,
index_endpoint: MatchingEngineIndexEndpoint,
filters: dict,
@@ -214,7 +215,7 @@ def similarity_search(
search_distance: float = 0.65,
filters={},
**kwargs: Any,
- ) -> List[Document]:
+ ) -> list[Document]:
"""Return docs most similar to query.
Args:
@@ -314,12 +315,12 @@ def _download_from_gcs(self, gcs_location: str) -> str:
@classmethod
def from_texts(
- cls: Type["MatchingEngine"],
- texts: List[str],
+ cls: type[MatchingEngine],
+ texts: list[str],
embedding: Embeddings,
- metadatas: Optional[List[dict]] = None,
+ metadatas: list[dict] | None = None,
**kwargs: Any,
- ) -> "MatchingEngine":
+ ) -> MatchingEngine:
"""Use from components instead."""
raise NotImplementedError(
"This method is not implemented. Instead, you should initialize the class"
@@ -329,12 +330,12 @@ def from_texts(
@classmethod
def from_documents(
- cls: Type["MatchingEngine"],
- documents: List[str],
+ cls: type[MatchingEngine],
+ documents: list[str],
embedding: Embeddings,
- metadatas: Optional[List[dict]] = None,
+ metadatas: list[dict] | None = None,
**kwargs: Any,
- ) -> "MatchingEngine":
+ ) -> MatchingEngine:
"""Use from components instead."""
raise NotImplementedError(
"This method is not implemented. Instead, you should initialize the class"
@@ -344,15 +345,15 @@ def from_documents(
@classmethod
def from_components(
- cls: Type["MatchingEngine"],
+ cls: type[MatchingEngine],
project_id: str,
region: str,
gcs_bucket_name: str,
index_id: str,
endpoint_id: str,
- credentials_path: Optional[str] = None,
- embedding: Optional[Embeddings] = None,
- ) -> "MatchingEngine":
+ credentials_path: str | None = None,
+ embedding: Embeddings | None = None,
+ ) -> MatchingEngine:
"""Takes the object creation out of the constructor.
Args:
@@ -427,8 +428,8 @@ def _validate_gcs_bucket(cls, gcs_bucket_name: str) -> str:
@classmethod
def _create_credentials_from_file(
- cls, json_credentials_path: Optional[str]
- ) -> Optional[Credentials]:
+ cls, json_credentials_path: str | None
+ ) -> Credentials | None:
"""Creates credentials for Google Cloud.
Args:
@@ -452,7 +453,7 @@ def _create_credentials_from_file(
@classmethod
def _create_index_by_id(
- cls, index_id: str, project_id: str, region: str, credentials: "Credentials"
+ cls, index_id: str, project_id: str, region: str, credentials: Credentials
) -> MatchingEngineIndex:
"""Creates a MatchingEngineIndex object by id.
@@ -472,7 +473,7 @@ def _create_index_by_id(
@classmethod
def _create_endpoint_by_id(
- cls, endpoint_id: str, project_id: str, region: str, credentials: "Credentials"
+ cls, endpoint_id: str, project_id: str, region: str, credentials: Credentials
) -> MatchingEngineIndexEndpoint:
"""Creates a MatchingEngineIndexEndpoint object by id.
@@ -498,8 +499,8 @@ def _create_endpoint_by_id(
@classmethod
def _get_gcs_client(
- cls, credentials: "Credentials", project_id: str
- ) -> "storage.Client":
+ cls, credentials: Credentials, project_id: str
+ ) -> storage.Client:
"""Lazily creates a GCS client.
Returns:
@@ -512,8 +513,8 @@ def _get_gcs_client(
@classmethod
def _get_index_client(
- cls, project_id: str, region: str, credentials: "Credentials"
- ) -> "storage.Client":
+ cls, project_id: str, region: str, credentials: Credentials
+ ) -> storage.Client:
"""Lazily creates a Matching Engine Index client.
Returns:
@@ -530,8 +531,8 @@ def _get_index_client(
@classmethod
def _get_index_endpoint_client(
- cls, project_id: str, region: str, credentials: "Credentials"
- ) -> "storage.Client":
+ cls, project_id: str, region: str, credentials: Credentials
+ ) -> storage.Client:
"""Lazily creates a Matching Engine Index Endpoint client.
Returns:
@@ -552,7 +553,7 @@ def _init_aiplatform(
project_id: str,
region: str,
gcs_bucket_name: str,
- credentials: "Credentials",
+ credentials: Credentials,
) -> None:
"""Configures the aiplatform library.
diff --git a/language/use-cases/document-qa/utils/matching_engine_utils.py b/language/use-cases/document-qa/utils/matching_engine_utils.py
index 6e5f5385dab..b1478c0b4fa 100644
--- a/language/use-cases/document-qa/utils/matching_engine_utils.py
+++ b/language/use-cases/document-qa/utils/matching_engine_utils.py
@@ -2,7 +2,6 @@
from datetime import datetime
import logging
import time
-from typing import Optional
from google.api_core.client_options import ClientOptions
from google.cloud import aiplatform_v1 as aipv1
@@ -18,7 +17,7 @@ def __init__(
project_id: str,
region: str,
index_name: str,
- index_endpoint_name: Optional[str] = None,
+ index_endpoint_name: str | None = None,
):
self.project_id = project_id
self.region = region
@@ -167,7 +166,7 @@ def deploy_index(
min_replica_count: int = 2,
max_replica_count: int = 10,
public_endpoint_enabled: bool = True,
- network: Optional[str] = None,
+ network: str | None = None,
):
try:
# Get index if exists
diff --git a/noxfile.py b/noxfile.py
index beaf82f205d..1ef53e1cea3 100644
--- a/noxfile.py
+++ b/noxfile.py
@@ -18,13 +18,11 @@
# Generated by synthtool. DO NOT EDIT!
-from __future__ import absolute_import
import os
import pathlib
import re
import shutil
-from typing import Dict, List
import warnings
import nox
@@ -36,7 +34,7 @@
DEFAULT_PYTHON_VERSION = "3.10"
-UNIT_TEST_PYTHON_VERSIONS: List[str] = ["3.10", "3.11", "3.12"]
+UNIT_TEST_PYTHON_VERSIONS: list[str] = ["3.10", "3.11", "3.12"]
UNIT_TEST_STANDARD_DEPENDENCIES = [
"mock",
"asyncmock",
@@ -44,23 +42,23 @@
"pytest-cov",
"pytest-asyncio",
]
-UNIT_TEST_EXTERNAL_DEPENDENCIES: List[str] = []
-UNIT_TEST_LOCAL_DEPENDENCIES: List[str] = []
-UNIT_TEST_DEPENDENCIES: List[str] = []
-UNIT_TEST_EXTRAS: List[str] = []
-UNIT_TEST_EXTRAS_BY_PYTHON: Dict[str, List[str]] = {}
-
-SYSTEM_TEST_PYTHON_VERSIONS: List[str] = ["3.8"]
-SYSTEM_TEST_STANDARD_DEPENDENCIES: List[str] = [
+UNIT_TEST_EXTERNAL_DEPENDENCIES: list[str] = []
+UNIT_TEST_LOCAL_DEPENDENCIES: list[str] = []
+UNIT_TEST_DEPENDENCIES: list[str] = []
+UNIT_TEST_EXTRAS: list[str] = []
+UNIT_TEST_EXTRAS_BY_PYTHON: dict[str, list[str]] = {}
+
+SYSTEM_TEST_PYTHON_VERSIONS: list[str] = ["3.8"]
+SYSTEM_TEST_STANDARD_DEPENDENCIES: list[str] = [
"mock",
"pytest",
"google-cloud-testutils",
]
-SYSTEM_TEST_EXTERNAL_DEPENDENCIES: List[str] = []
-SYSTEM_TEST_LOCAL_DEPENDENCIES: List[str] = []
-SYSTEM_TEST_DEPENDENCIES: List[str] = []
-SYSTEM_TEST_EXTRAS: List[str] = []
-SYSTEM_TEST_EXTRAS_BY_PYTHON: Dict[str, List[str]] = {}
+SYSTEM_TEST_EXTERNAL_DEPENDENCIES: list[str] = []
+SYSTEM_TEST_LOCAL_DEPENDENCIES: list[str] = []
+SYSTEM_TEST_DEPENDENCIES: list[str] = []
+SYSTEM_TEST_EXTRAS: list[str] = []
+SYSTEM_TEST_EXTRAS_BY_PYTHON: dict[str, list[str]] = {}
CURRENT_DIRECTORY = pathlib.Path(__file__).parent.absolute()
@@ -112,9 +110,22 @@ def format(session):
Run isort to sort imports. Then run black
to format code to uniform standard.
"""
- session.install(BLACK_VERSION, ISORT_VERSION)
+ session.install(BLACK_VERSION, ISORT_VERSION, "autoflake", "ruff")
# Use the --fss option to sort imports using strict alphabetical order.
# See https://pycqa.github.io/isort/docs/configuration/options.html#force-sort-within-sections
+ session.run(
+ "autoflake",
+ "-i",
+ "-r",
+ "--remove-all-unused-imports",
+ *LINT_PATHS,
+ )
+ session.run(
+ "ruff",
+ "check",
+ "--fix-only",
+ *LINT_PATHS,
+ )
session.run(
"isort",
"--fss",
@@ -150,7 +161,9 @@ def format_notebooks(session):
session.run(
"nbqa", "pyupgrade", "--exit-zero-even-if-changed", "--py310-plus", *LINT_PATHS
)
- session.run("nbqa", "autoflake", "-i", "--remove-all-unused-imports", *LINT_PATHS)
+ session.run(
+ "nbqa", "autoflake", "-i", "--remove-all-unused-imports", "-r", *LINT_PATHS
+ )
session.run(
"nbqa",
"isort",
diff --git a/owlbot.py b/owlbot.py
index 52984db5144..9f1ff224bb5 100644
--- a/owlbot.py
+++ b/owlbot.py
@@ -39,7 +39,7 @@
# Sort Spelling Allowlist
spelling_allow_file = ".github/actions/spelling/allow.txt"
-with open(spelling_allow_file, "r", encoding="utf-8") as file:
+with open(spelling_allow_file, encoding="utf-8") as file:
unique_words = sorted(set(file))
with open(spelling_allow_file, "w", encoding="utf-8") as file:
diff --git a/search/cloud-function/python/main.py b/search/cloud-function/python/main.py
index f3c1c19fd9b..a0180d8d464 100644
--- a/search/cloud-function/python/main.py
+++ b/search/cloud-function/python/main.py
@@ -23,7 +23,7 @@
"""
import os
-from typing import Any, Dict, Tuple
+from typing import Any
from flask import Flask, Request, jsonify, request
import functions_framework
@@ -53,7 +53,7 @@
@functions_framework.http
-def vertex_ai_search(http_request: Request) -> Tuple[Any, int, Dict[str, str]]:
+def vertex_ai_search(http_request: Request) -> tuple[Any, int, dict[str, str]]:
"""
Handle HTTP requests for Vertex AI Search.
@@ -88,7 +88,7 @@ def vertex_ai_search(http_request: Request) -> Tuple[Any, int, Dict[str, str]]:
def create_error_response(
message: str, status_code: int
- ) -> Tuple[Any, int, Dict[str, str]]:
+ ) -> tuple[Any, int, dict[str, str]]:
"""Standardize the error responses with common headers."""
return (jsonify({"error": message}), status_code, headers)
@@ -119,7 +119,7 @@ def create_error_response(
app = Flask(__name__)
@app.route("/", methods=["POST"])
- def index() -> Tuple[Any, int, Dict[str, str]]:
+ def index() -> tuple[Any, int, dict[str, str]]:
"""
Flask route for handling POST requests when running locally.
diff --git a/search/cloud-function/python/test_integration_vertex_search_client.py b/search/cloud-function/python/test_integration_vertex_search_client.py
index 3ddfb106133..b6587d4aab3 100644
--- a/search/cloud-function/python/test_integration_vertex_search_client.py
+++ b/search/cloud-function/python/test_integration_vertex_search_client.py
@@ -19,8 +19,8 @@
environment variables and access to the Vertex AI Search service.
"""
+from collections.abc import Generator
import os
-from typing import Generator
import pytest
from vertex_ai_search_client import VertexAISearchClient, VertexAISearchConfig
diff --git a/search/cloud-function/python/vertex_ai_search_client.py b/search/cloud-function/python/vertex_ai_search_client.py
index a4a4e82cbe9..720cef64e69 100644
--- a/search/cloud-function/python/vertex_ai_search_client.py
+++ b/search/cloud-function/python/vertex_ai_search_client.py
@@ -35,7 +35,7 @@
import html
import json
import re
-from typing import Any, Dict, List, Literal, Union
+from typing import Any, Literal
from google.api_core.client_options import ClientOptions
from google.cloud import discoveryengine_v1alpha as discoveryengine
@@ -61,9 +61,9 @@ class VertexAISearchConfig:
project_id: str
location: str
data_store_id: str
- engine_data_type: Union[EngineDataTypeStr, str]
- engine_chunk_type: Union[EngineChunkTypeStr, str]
- summary_type: Union[SummaryTypeStr, str]
+ engine_data_type: EngineDataTypeStr | str
+ engine_chunk_type: EngineChunkTypeStr | str
+ summary_type: SummaryTypeStr | str
def __post_init__(self) -> None:
"""Validate and convert string inputs to appropriate types."""
@@ -85,7 +85,7 @@ def _validate_enum(value: str, enum_type: Any, default: str) -> str:
print(f"Warning: Invalid value '{value}'. Using default: '{default}'")
return default
- def to_dict(self) -> Dict[str, str]:
+ def to_dict(self) -> dict[str, str]:
"""Convert the config to a dictionary."""
return {
"project_id": self.project_id,
@@ -144,7 +144,7 @@ def _get_serving_config(self) -> str:
serving_config="default_config",
)
- def search(self, query: str, page_size: int = 10) -> Dict[str, Any]:
+ def search(self, query: str, page_size: int = 10) -> dict[str, Any]:
"""
Perform a search query using Vertex AI Search.
@@ -218,7 +218,7 @@ def build_search_request(
),
)
- def map_search_pager_to_dict(self, pager: SearchPager) -> Dict[str, Any]:
+ def map_search_pager_to_dict(self, pager: SearchPager) -> dict[str, Any]:
"""
Maps a SearchPager to a dictionary structure, iterativly requesting results.
@@ -230,7 +230,7 @@ def map_search_pager_to_dict(self, pager: SearchPager) -> Dict[str, Any]:
Returns:
Dict[str, Any]: A dictionary containing the search results and metadata.
"""
- output: Dict[str, Any] = {
+ output: dict[str, Any] = {
"results": [
SearchResponse.SearchResult.to_dict(result) for result in pager
],
@@ -267,7 +267,7 @@ def map_search_pager_to_dict(self, pager: SearchPager) -> Dict[str, Any]:
return output
- def simplify_search_results(self, response: Dict[str, Any]) -> Dict[str, Any]:
+ def simplify_search_results(self, response: dict[str, Any]) -> dict[str, Any]:
"""
Simplify the search results by parsing documents and chunks.
@@ -290,7 +290,7 @@ def simplify_search_results(self, response: Dict[str, Any]) -> Dict[str, Any]:
response["simplified_results"] = simplified_results
return response
- def _parse_document_result(self, document: Dict[str, Any]) -> Dict[str, Any]:
+ def _parse_document_result(self, document: dict[str, Any]) -> dict[str, Any]:
"""
Parse a single document result from the search response.
@@ -317,7 +317,7 @@ def _parse_document_result(self, document: Dict[str, Any]) -> Dict[str, Any]:
json_data = {}
metadata.update(json_data)
- result: Dict[str, Any] = {"metadata": metadata}
+ result: dict[str, Any] = {"metadata": metadata}
if self.config.engine_data_type == "STRUCTURED":
structured_data = (
@@ -337,7 +337,7 @@ def _parse_document_result(self, document: Dict[str, Any]) -> Dict[str, Any]:
return result
- def _parse_segments(self, segments: List[Dict[str, Any]]) -> str:
+ def _parse_segments(self, segments: list[dict[str, Any]]) -> str:
"""
Parse extractive segments from a single document of search results.
@@ -361,7 +361,7 @@ def _parse_segments(self, segments: List[Dict[str, Any]]) -> str:
for segment in parsed_segments
)
- def _parse_snippets(self, snippets: List[Dict[str, Any]]) -> str:
+ def _parse_snippets(self, snippets: list[dict[str, Any]]) -> str:
"""
Parse snippets from a single document of search results.
@@ -377,7 +377,7 @@ def _parse_snippets(self, snippets: List[Dict[str, Any]]) -> str:
if snippet.get("snippetStatus") == "SUCCESS"
)
- def _parse_chunk_result(self, chunk: Dict[str, Any]) -> Dict[str, Any]:
+ def _parse_chunk_result(self, chunk: dict[str, Any]) -> dict[str, Any]:
"""
Parse a single chunk result from the search response.
diff --git a/search/web-app/ekg_utils.py b/search/web-app/ekg_utils.py
index 1b823a52444..374500c8a3c 100644
--- a/search/web-app/ekg_utils.py
+++ b/search/web-app/ekg_utils.py
@@ -13,8 +13,8 @@
# limitations under the License.
"""Enterprise Knowledge Graph Utilities"""
+from collections.abc import Sequence
import json
-from typing import List, Optional, Sequence, Tuple
from google.cloud import enterpriseknowledgegraph as ekg
@@ -26,10 +26,10 @@ def search_public_kg(
project_id: str,
location: str,
search_query: str,
- languages: Optional[Sequence[str]] = None,
- types: Optional[Sequence[str]] = None,
- limit: Optional[int] = None,
-) -> Tuple:
+ languages: Sequence[str] | None = None,
+ types: Sequence[str] | None = None,
+ limit: int | None = None,
+) -> tuple:
"""
Make API Request to Public Knowledge Graph.
"""
@@ -58,7 +58,7 @@ def search_public_kg(
return entities, request_url, request_json, response_json
-def get_entities(response: ekg.SearchPublicKgResponse) -> List:
+def get_entities(response: ekg.SearchPublicKgResponse) -> list:
"""
Extract Entities from Knowledge Graph Response
"""
diff --git a/search/web-app/genappbuilder_utils.py b/search/web-app/genappbuilder_utils.py
index 6ddec4837d0..d43b89ba192 100644
--- a/search/web-app/genappbuilder_utils.py
+++ b/search/web-app/genappbuilder_utils.py
@@ -14,7 +14,6 @@
"""Vertex AI Search Utilities"""
from os.path import basename
-from typing import Dict, List, Optional, Tuple
from google.cloud import discoveryengine_v1alpha as discoveryengine
@@ -25,7 +24,7 @@ def list_documents(
project_id: str,
location: str,
datastore_id: str,
-) -> List[Dict[str, str]]:
+) -> list[dict[str, str]]:
client = discoveryengine.DocumentServiceClient()
parent = client.branch_path(
@@ -48,15 +47,15 @@ def list_documents(
def search_enterprise_search(
project_id: str,
location: str,
- data_store_id: Optional[str] = None,
- engine_id: Optional[str] = None,
+ data_store_id: str | None = None,
+ engine_id: str | None = None,
page_size: int = 50,
- search_query: Optional[str] = None,
- image_bytes: Optional[bytes] = None,
- params: Optional[Dict] = None,
- summary_model: Optional[str] = None,
- summary_preamble: Optional[str] = None,
-) -> Tuple[List[Dict[str, str | List]], str, str, str, str]:
+ search_query: str | None = None,
+ image_bytes: bytes | None = None,
+ params: dict | None = None,
+ summary_model: str | None = None,
+ summary_preamble: str | None = None,
+) -> tuple[list[dict[str, str | list]], str, str, str, str]:
if bool(search_query) == bool(image_bytes):
raise ValueError("Cannot provide both search_query and image_bytes")
@@ -157,14 +156,14 @@ def search_enterprise_search(
def get_enterprise_search_results(
response: discoveryengine.SearchResponse,
-) -> List[Dict[str, str | List]]:
+) -> list[dict[str, str | list]]:
"""
Extract Results from Enterprise Search Response
"""
ROBOT = "https://www.google.com/images/errors/robot.png"
- def get_thumbnail_image(data: Dict) -> str:
+ def get_thumbnail_image(data: dict) -> str:
cse_thumbnail = data.get("pagemap", {}).get("cse_thumbnail")
image_link = data.get("image", {}).get("thumbnailLink")
@@ -174,7 +173,7 @@ def get_thumbnail_image(data: Dict) -> str:
return image_link
return ROBOT
- def get_formatted_link(data: Dict) -> str:
+ def get_formatted_link(data: dict) -> str:
html_formatted_url = data.get("htmlFormattedUrl")
image_context_link = data.get("image", {}).get("contextLink")
link = data.get("link")
@@ -220,9 +219,9 @@ def recommend_personalize(
datastore_id: str,
serving_config_id: str,
document_id: str,
- user_pseudo_id: Optional[str] = "xxxxxxxxxxx",
- attribution_token: Optional[str] = None,
-) -> Tuple:
+ user_pseudo_id: str | None = "xxxxxxxxxxx",
+ attribution_token: str | None = None,
+) -> tuple:
# Create a client
client = discoveryengine.RecommendationServiceClient()
@@ -271,7 +270,7 @@ def get_storage_link(uri: str) -> str:
def get_personalize_results(
response: discoveryengine.RecommendResponse,
-) -> List[Dict]:
+) -> list[dict]:
"""
Extract Results from Personalize Response
"""
From 74ef55b64c33a088e7c75eec75552d52d08dac35 Mon Sep 17 00:00:00 2001
From: nhootan <103317089+nhootan@users.noreply.github.com>
Date: Wed, 18 Sep 2024 13:30:57 -0400
Subject: [PATCH 07/10] feat: Adding the initial version of Vertex Prompt
Optimizer UI Notebook. (#1099)
# Description
Adding the first version of the Vertex AI Prompt Optimizer UI Notebook.
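For context, a condensed sketch of the notebook's core launch path: it writes the
optimizer config to Cloud Storage and submits a Vertex AI custom job running the
APD container (names, container URI, and machine type follow the notebook's own
`run_custom_job`/`run_apd` helpers; the bucket and project values are placeholders):

```python
import json

from google.cloud import aiplatform
import tensorflow.io.gfile as gfile


def launch_optimizer(config: dict, bucket_uri: str, display_name: str) -> aiplatform.CustomJob:
    # Persist the input config where the optimizer container can read it.
    config_path = f"{bucket_uri}/{display_name}/input_config.json"
    with gfile.GFile(config_path, "w") as f:
        json.dump(config, f)

    aiplatform.init(
        project=config["project"],
        location=config["target_model_location"],
        staging_bucket=f"{bucket_uri}/{display_name}",
    )

    # Single-replica worker running the prompt-optimizer image.
    job = aiplatform.CustomJob(
        display_name=display_name,
        worker_pool_specs=[
            {
                "replica_count": 1,
                "container_spec": {
                    "image_uri": "us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd:preview_v1_0",
                    "args": [f"--config={config_path}"],
                },
                "machine_spec": {"machine_type": "n1-standard-4"},
            }
        ],
    )
    job.submit()  # Non-blocking; the notebook polls the job state to drive its progress UI.
    return job
```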
---------
Co-authored-by: hootan
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
---
.github/CODEOWNERS | 1 +
.github/actions/spelling/allow.txt | 3 +
.../vertex_ai_prompt_optimizer_ui.ipynb | 953 ++++++++++++++++++
3 files changed, 957 insertions(+)
create mode 100644 gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 0b6764fb268..05e6c60cf79 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -28,6 +28,7 @@
/generative-ai/language/grounding @koverholt @holtskinner @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/language/orchestration/langchain @kweinmeister @RajeshThallam @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/language/prompts @polong-lin @GoogleCloudPlatform/generative-ai-devrel
+/generative-ai/language/prompts/prompt_optimizer @nhootan @inardini @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/language/sample-apps @rominirani @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/language/translation @holtskinner @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/language/tuning @erwinh85 @GoogleCloudPlatform/generative-ai-devrel
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index f9346345d34..05f313ccb60 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -229,6 +229,7 @@ Unimicron
Upserting
Urs
Uszkoreit
+VAPO
VFT
VMs
VOS
@@ -272,6 +273,7 @@ apredict
aquery
arXiv
aretrieve
+argmax
arun
astype
autoflake
@@ -329,6 +331,7 @@ docstore
dpi
draig
drinkware
+dropdown
dropna
dsl
dtypes
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
new file mode 100644
index 00000000000..aacc919c5bd
--- /dev/null
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
@@ -0,0 +1,953 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "hlI1rYKa2IGx"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "pHyuJTFr2IGx"
+ },
+ "source": [
+ "# Overview\n",
+ "Welcome to Vertex AI Prompt Optimizer (VAPO)! This Notebook showcases VAPO, a tool that iteratively optimizes prompts to suit a target model (e.g., `gemini-1.5-pro`) using target-specific metric(s).\n",
+ "\n",
+ "Key Use Cases:\n",
+ "\n",
+ "* Prompt Optimization: Enhance the quality of an initial prompt by refining its structure and content to match the target model's optimal input characteristics.\n",
+ "\n",
+ "* Prompt Translation: Adapt prompts optimized for one model to work effectively with a different target model."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tTtKHedrO1Rx"
+ },
+ "source": [
+ "# Step 0: Install packages and libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "8-Zw72vFORz_"
+ },
+ "outputs": [],
+ "source": [
+ "! pip3 install -U google-cloud-aiplatform -q\n",
+ "\n",
+ "import datetime\n",
+ "import os\n",
+ "import time\n",
+ "\n",
+ "from IPython.display import HTML, display\n",
+ "from google.auth import default\n",
+ "from google.cloud import aiplatform, storage\n",
+ "from google.colab import auth, output\n",
+ "import gspread\n",
+ "import ipywidgets as widgets\n",
+ "import jinja2\n",
+ "from jinja2 import BaseLoader, Environment\n",
+ "import jinja2.meta\n",
+ "import pandas as pd\n",
+ "import tensorflow.io.gfile as gfile\n",
+ "\n",
+ "output.enable_custom_widget_manager()\n",
+ "from io import StringIO\n",
+ "import json\n",
+ "import re\n",
+ "\n",
+ "\n",
+ "def authenticate():\n",
+ " auth.authenticate_user()\n",
+ " creds, _ = default()\n",
+ " return gspread.authorize(creds)\n",
+ "\n",
+ "\n",
+ "def is_target_required_metric(eval_metric: str) -> bool:\n",
+ " return eval_metric in [\n",
+ " \"bleu\",\n",
+ " \"exact_match\",\n",
+ " \"question_answering_correctness\",\n",
+ " \"rouge_1\",\n",
+ " \"rouge_2\",\n",
+ " \"rouge_l\",\n",
+ " \"rouge_l_sum\",\n",
+ " \"tool_call_valid\",\n",
+ " \"tool_name_match\",\n",
+ " \"tool_parameter_key_match\",\n",
+ " \"tool_parameter_kv_match\",\n",
+ " ]\n",
+ "\n",
+ "\n",
+ "def is_run_target_required(eval_metric_types: list[str], source_model: str) -> bool:\n",
+ " if source_model:\n",
+ " return False\n",
+ "\n",
+ " label_required = False\n",
+ " for metric in eval_metric_types:\n",
+ " label_required = label_required or is_target_required_metric(metric)\n",
+ " return label_required\n",
+ "\n",
+ "\n",
+ "_TARGET_KEY = \"target\"\n",
+ "\n",
+ "\n",
+ "def validate_prompt_and_data(\n",
+ " template: str,\n",
+ " dataset_path: str,\n",
+ " placeholder_to_content: str,\n",
+ " label_enforced: bool,\n",
+ ") -> None:\n",
+ " \"\"\"Validates the prompt template and the dataset.\"\"\"\n",
+ " placeholder_to_content = json.loads(placeholder_to_content)\n",
+ " with gfile.GFile(dataset_path, \"r\") as f:\n",
+ " data = [json.loads(line) for line in f.readlines()]\n",
+ "\n",
+ " env = jinja2.Environment()\n",
+ " try:\n",
+ " parsed_content = env.parse(template)\n",
+ " except jinja2.exceptions.TemplateSyntaxError as e:\n",
+ " raise ValueError(f\"Invalid template: {template}\") from e\n",
+ "\n",
+ " template_variables = jinja2.meta.find_undeclared_variables(parsed_content)\n",
+ " extra_keys = set()\n",
+ " for ex in data:\n",
+ " ex.update(placeholder_to_content)\n",
+ " missing_keys = [key for key in template_variables if key not in ex]\n",
+ " extra_keys.update([key for key in ex if key not in template_variables])\n",
+ " if label_enforced:\n",
+ " if _TARGET_KEY not in ex:\n",
+ " raise ValueError(\n",
+ " f\"The example {ex} doesn't have a key corresponding to the target\"\n",
+ " f\" var: {_TARGET_KEY}\"\n",
+ " )\n",
+ " if not ex[_TARGET_KEY]:\n",
+ " raise ValueError(f\"The following example has an empty target: {ex}\")\n",
+ " if missing_keys:\n",
+ " raise ValueError(\n",
+ " f\"The example {ex} doesn't have a key corresponding to following\"\n",
+ " f\" template vars: {missing_keys}\"\n",
+ " )\n",
+ " if extra_keys:\n",
+ " raise Warning(\n",
+ " \"Warning: extra keys in the examples not used in the context/task\"\n",
+ " f\" template {extra_keys}\"\n",
+ " )\n",
+ "\n",
+ "\n",
+ "def run_custom_job(\n",
+ " display_name: str,\n",
+ " container_uri: str,\n",
+ " container_args: dict[str, str],\n",
+ ") -> None:\n",
+ " \"\"\"A sample to create custom jobs.\"\"\"\n",
+ " worker_pool_specs = [\n",
+ " {\n",
+ " \"replica_count\": 1,\n",
+ " \"container_spec\": {\n",
+ " \"image_uri\": container_uri,\n",
+ " \"args\": [f\"--{k}={v}\" for k, v in container_args.items()],\n",
+ " },\n",
+ " \"machine_spec\": {\n",
+ " \"machine_type\": \"n1-standard-4\",\n",
+ " },\n",
+ " }\n",
+ " ]\n",
+ "\n",
+ " custom_job = aiplatform.CustomJob(\n",
+ " display_name=display_name,\n",
+ " worker_pool_specs=worker_pool_specs,\n",
+ " )\n",
+ " custom_job.submit()\n",
+ " return custom_job\n",
+ "\n",
+ "\n",
+ "def run_apd(config: dict[str, str], bucket_uri: str, display_name: str) -> None:\n",
+ " \"\"\"A function to the vertex prompt optimizer.\"\"\"\n",
+ " print(f\"\\n\\nJob display name: {display_name}\")\n",
+ " version = \"preview_v1_0\"\n",
+ " container_uri = \"us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/apd\"\n",
+ " config_path = f\"{bucket_uri}/{display_name}/input_config.json\"\n",
+ "\n",
+ " with gfile.GFile(config_path, \"w\") as f:\n",
+ " json.dump(config, f)\n",
+ "\n",
+ " aiplatform.init(\n",
+ " project=config[\"project\"],\n",
+ " location=config[\"target_model_location\"],\n",
+ " staging_bucket=f\"{bucket_uri}/{display_name}\",\n",
+ " )\n",
+ "\n",
+ " return run_custom_job(\n",
+ " display_name=display_name,\n",
+ " container_uri=f\"{container_uri}:{version}\",\n",
+ " container_args={\"config\": config_path},\n",
+ " )\n",
+ "\n",
+ "\n",
+ "def update_best_display(\n",
+ " df: pd.DataFrame,\n",
+ " textarea: widgets.Textarea,\n",
+ " best_score_label: widgets.Label,\n",
+ " eval_metric: str,\n",
+ ") -> None:\n",
+ " \"\"\"Update the best prompt display.\"\"\"\n",
+ "\n",
+ " df[\"score\"] = df[f\"metrics.{eval_metric}/mean\"]\n",
+ "\n",
+ " best_template = df.loc[df[\"score\"].argmax(), \"prompt\"]\n",
+ " best_score = df.loc[df[\"score\"].argmax(), \"score\"]\n",
+ " original_score = df.loc[0, \"score\"]\n",
+ "\n",
+ " def placeholder_llm():\n",
+ " return \"{{llm()}}\"\n",
+ "\n",
+ " env = Environment(loader=BaseLoader())\n",
+ " env.globals[\"llm\"] = placeholder_llm\n",
+ "\n",
+ " best_template = best_template.replace(\"store('answer', llm())\", \"llm()\")\n",
+ " textarea.value = best_template\n",
+ " improvement = best_score - original_score\n",
+ " no_improvement_str = \"\\nNo better template is found yet.\" if not improvement else \"\"\n",
+ " best_score_label.value = (\n",
+ " f\"Score: {best_score}\" f\" Improvement: {improvement: .3f} {no_improvement_str}\"\n",
+ " )\n",
+ "\n",
+ "\n",
+ "def generate_dataframe(filename: str) -> pd.DataFrame:\n",
+ " \"\"\"Generates a pandas dataframe from a json file.\"\"\"\n",
+ " if not gfile.exists(filename):\n",
+ " return pd.DataFrame()\n",
+ "\n",
+ " with gfile.GFile(filename, \"r\") as f:\n",
+ " try:\n",
+ " data = json.load(f)\n",
+ " except:\n",
+ " return pd.DataFrame()\n",
+ " return pd.json_normalize(data)\n",
+ "\n",
+ "\n",
+ "def left_aligned_df_html(df: pd.DataFrame) -> None:\n",
+ " \"\"\"Displays a Pandas DataFrame in Colab with left-aligned values.\"\"\"\n",
+ "\n",
+ " # Convert to HTML table, but keep the HTML in a variable\n",
+ " html_table = df.to_html(index=False, classes=\"left-aligned\")\n",
+ "\n",
+ " # Add CSS styling to left-align table data cells and override default styles\n",
+ " styled_html = f\"\"\"\n",
+ " \n",
+ " {html_table}\n",
+ " \"\"\"\n",
+ "\n",
+ " # Display the styled HTML table\n",
+ " return HTML(styled_html)\n",
+ "\n",
+ "\n",
+ "def extract_top_level_function_name(source_code: str) -> str | None:\n",
+ " match = re.search(r\"^def\\s+([a-zA-Z_]\\w*)\\s*\\(\", source_code, re.MULTILINE)\n",
+ " if match:\n",
+ " return match.group(1)\n",
+ " return None\n",
+ "\n",
+ "\n",
+ "class ProgressForm:\n",
+ " \"\"\"A class to display the progress of the optimization job.\"\"\"\n",
+ "\n",
+ " def __init__(self):\n",
+ " self.instruction_progress_bar = None\n",
+ " self.instruction_display = None\n",
+ " self.instruction_best = None\n",
+ " self.instruction_score = None\n",
+ "\n",
+ " self.demo_progress_bar = None\n",
+ " self.demo_display = None\n",
+ " self.demo_best = None\n",
+ " self.demo_score = None\n",
+ "\n",
+ " self.job_state_display = None\n",
+ "\n",
+ " self.instruction_df = None\n",
+ " self.demo_df = None\n",
+ "\n",
+ " self.started = False\n",
+ "\n",
+ " def init(self, params: dict[str, str]):\n",
+ " \"\"\"Initialize the progress form.\"\"\"\n",
+ " self.job_state_display = display(\n",
+ " HTML(\"Job State: Not Started!\"), display_id=True\n",
+ " )\n",
+ " self.status_display = display(HTML(\"\"), display_id=True)\n",
+ "\n",
+ " if params[\"optimization_mode\"] in [\"instruction\", \"instruction_and_demo\"]:\n",
+ " (\n",
+ " self.instruction_progress_bar,\n",
+ " self.instruction_display,\n",
+ " self.instruction_best,\n",
+ " self.instruction_score,\n",
+ " ) = self.create_progress_ui(\"Instruction\", params[\"num_steps\"])\n",
+ "\n",
+ " if params[\"optimization_mode\"] in [\"demonstration\", \"instruction_and_demo\"]:\n",
+ " (\n",
+ " self.demo_progress_bar,\n",
+ " self.demo_display,\n",
+ " self.demo_best,\n",
+ " self.demo_score,\n",
+ " ) = self.create_progress_ui(\n",
+ " \"Demonstration\", params[\"num_demo_set_candidates\"]\n",
+ " )\n",
+ "\n",
+ " eval_metric = \"composite_metric\"\n",
+ " if len(params[\"eval_metrics_types\"]) == 1:\n",
+ " eval_metric = params[\"eval_metrics_types\"][0]\n",
+ "\n",
+ " if eval_metric != \"composite_metric\" and \"custom_metric_source_code\" in params:\n",
+ " self.eval_metric = extract_top_level_function_name(\n",
+ " params[\"custom_metric_source_code\"]\n",
+ " )\n",
+ " else:\n",
+ " self.eval_metric = eval_metric\n",
+ "\n",
+ " self.output_path = params[\"output_path\"]\n",
+ " self.started = True\n",
+ "\n",
+ " def update_progress(\n",
+ " self,\n",
+ " progress_bar: widgets.IntProgress,\n",
+ " templates_file: str,\n",
+ " df: pd.DataFrame | None,\n",
+ " df_display: display,\n",
+ " best_textarea: widgets.Textarea,\n",
+ " best_score: widgets.Label,\n",
+ " eval_metric: str,\n",
+ " ):\n",
+ " \"\"\"Update the progress of the optimization job.\"\"\"\n",
+ "\n",
+ " def get_last_step(df: pd.DataFrame):\n",
+ " if df.empty:\n",
+ " return -1\n",
+ " return int(df[\"step\"].max())\n",
+ "\n",
+ " if progress_bar is None or df is None:\n",
+ " return pd.DataFrame()\n",
+ "\n",
+ " new_df = generate_dataframe(templates_file)\n",
+ "\n",
+ " last_step = get_last_step(df)\n",
+ " new_last_step = get_last_step(new_df)\n",
+ " if new_last_step > last_step:\n",
+ " df_display.update(left_aligned_df_html(new_df))\n",
+ " update_best_display(new_df, best_textarea, best_score, eval_metric)\n",
+ " progress_bar.value = progress_bar.value + new_last_step - last_step\n",
+ "\n",
+ " return new_df\n",
+ "\n",
+ " def create_progress_ui(\n",
+ " self, opt_mode: str, num_opt_steps: int\n",
+ " ) -> tuple[widgets.IntProgress, display, widgets.Textarea, widgets.Label]:\n",
+ " \"\"\"Create the progress UI for a specific optimization mode.\"\"\"\n",
+ " print(f\"\\n\\n{opt_mode} Optimization\")\n",
+ " progress_bar = widgets.IntProgress(\n",
+ " value=0, min=0, max=num_opt_steps, step=1, description=\"Progress\"\n",
+ " )\n",
+ " display(progress_bar)\n",
+ " print(\"\\nGenerated Templates:\")\n",
+ " templates_display = display(\"No template is evaluated yet!\", display_id=True)\n",
+ "\n",
+ " print(\"\\nBest Template so far:\")\n",
+ " best_textarea = widgets.Textarea(\n",
+ " value=\"NA\",\n",
+ " disabled=False,\n",
+ " layout=widgets.Layout(width=\"80%\", height=\"150px\"),\n",
+ " )\n",
+ " display(best_textarea)\n",
+ "\n",
+ " best_score = widgets.Label(value=\"Score: NA Improvement: NA\")\n",
+ " display(best_score)\n",
+ "\n",
+ " return progress_bar, templates_display, best_textarea, best_score\n",
+ "\n",
+ " def monitor_progress(self, job: aiplatform.CustomJob, params: dict[str, str]):\n",
+ " \"\"\"Monitor the progress of the optimization job.\"\"\"\n",
+ " if not self.started:\n",
+ " self.init(params)\n",
+ "\n",
+ " self.job_state_display.update(HTML(f\"Job State: {job.state.name}\"))\n",
+ "\n",
+ " # Initial display of the dataframe\n",
+ " instruction_templates_file = f\"{self.output_path}/instruction/templates.json\"\n",
+ " demo_templates_file = f\"{self.output_path}/demonstration/templates.json\"\n",
+ "\n",
+ " if not job.done():\n",
+ " self.instruction_df = self.update_progress(\n",
+ " self.instruction_progress_bar,\n",
+ " instruction_templates_file,\n",
+ " self.instruction_df,\n",
+ " self.instruction_display,\n",
+ " self.instruction_best,\n",
+ " self.instruction_score,\n",
+ " self.eval_metric,\n",
+ " )\n",
+ " self.demo_df = self.update_progress(\n",
+ " self.demo_progress_bar,\n",
+ " demo_templates_file,\n",
+ " self.demo_df,\n",
+ " self.demo_display,\n",
+ " self.demo_best,\n",
+ " self.demo_score,\n",
+ " self.eval_metric,\n",
+ " )\n",
+ " return True\n",
+ "\n",
+ " if job.state.name != \"JOB_STATE_SUCCEEDED\":\n",
+ " errors = [f\"Error: Job failed with error {job.error}.\"]\n",
+ " for err_file in [\n",
+ " f\"{self.output_path}/instruction/error.json\",\n",
+ " f\"{self.output_path}/demonstration/error.json\",\n",
+ " ]:\n",
+ " if gfile.exists(err_file):\n",
+ " with gfile.GFile(err_file, \"r\") as f:\n",
+ " error_json = json.load(f)\n",
+ " errors.append(f\"Detailed error: {error_json}\")\n",
+ " errors.append(\n",
+ " f\"Please feel free to send {err_file} to the VAPO team to help\"\n",
+ " \" resolving the issue.\"\n",
+ " )\n",
+ "\n",
+ " errors.append(\n",
+ " \"All the templates found before failure can be found under\"\n",
+ " f\" {self.output_path}\"\n",
+ " )\n",
+ " errors.append(\n",
+ " \"Please consider rerunning to make sure the failure is intransient.\"\n",
+ " )\n",
+ " err = \"\\n\".join(errors)\n",
+ " self.status_display.update(HTML(f'{err}'))\n",
+ " else:\n",
+ " self.status_display.update(\n",
+ " HTML(\n",
+ " 'Job succeeded! All the'\n",
+ " f\" artifacts can be found under {self.output_path}\"\n",
+ " )\n",
+ " )\n",
+ " return False\n",
+ "\n",
+ "\n",
+ "def display_dataframe(df: pd.DataFrame) -> None:\n",
+ " \"\"\"Display a pandas dataframe in Colab.\"\"\"\n",
+ "\n",
+ " # Function to wrap text in a scrollable div\n",
+ " def wrap_in_scrollable_div(text):\n",
+ " return f'{text}
'\n",
+ "\n",
+ " # Apply the function to every cell using the format method\n",
+ " styled_html = df.style.format(wrap_in_scrollable_div).to_html(index=False)\n",
+ "\n",
+ " # Display the HTML in the notebook\n",
+ " display(HTML(styled_html))\n",
+ "\n",
+ "\n",
+ "def split_gcs_path(gcs_path: str) -> tuple[str, str]:\n",
+ " \"\"\"Splits a full GCS path into bucket name and prefix.\"\"\"\n",
+ " if gcs_path.startswith(\"gs://\"):\n",
+ " path_without_scheme = gcs_path[5:] # Remove the 'gs://' part\n",
+ " parts = path_without_scheme.split(\"/\", 1)\n",
+ " bucket_name = parts[0]\n",
+ " prefix = parts[1] if len(parts) > 1 else \"\"\n",
+ " return bucket_name, prefix\n",
+ " else:\n",
+ " raise ValueError(\"Invalid GCS path. Must start with 'gs://'\")\n",
+ "\n",
+ "\n",
+ "def list_gcs_objects(full_gcs_path: str) -> list[str]:\n",
+ " \"\"\"Lists all the objects in the given GCS path.\"\"\"\n",
+ " bucket_name, prefix = split_gcs_path(full_gcs_path)\n",
+ " storage_client = storage.Client()\n",
+ " bucket = storage_client.bucket(bucket_name)\n",
+ " blobs = bucket.list_blobs(\n",
+ " prefix=prefix\n",
+ " ) # List all objects that start with the prefix\n",
+ "\n",
+ " return [blob.name for blob in blobs]\n",
+ "\n",
+ "\n",
+ "def find_directories_with_files(\n",
+ " full_gcs_path: str, required_files: list[str]\n",
+ ") -> list[str]:\n",
+ " \"\"\"Finds directories containing specific files under the given full GCS path.\"\"\"\n",
+ " bucket_name, prefix = split_gcs_path(full_gcs_path)\n",
+ " all_paths = list_gcs_objects(f\"gs://{bucket_name}/{prefix}\")\n",
+ " directories = set()\n",
+ "\n",
+ " # Create a dictionary to track files found in each directory\n",
+ " file_presence = {}\n",
+ " for path in all_paths:\n",
+ " directory = \"/\".join(path.split(\"/\")[:-1]) # Get the directory part of the path\n",
+ " filename = path.split(\"/\")[-1] # Get the filename part of the path\n",
+ " if directory:\n",
+ " if directory not in file_presence:\n",
+ " file_presence[directory] = set()\n",
+ " file_presence[directory].add(filename)\n",
+ "\n",
+ " # Check which directories have all required files\n",
+ " for directory, files in file_presence.items():\n",
+ " if all(file in files for file in required_files):\n",
+ " directories.add(f\"gs://{bucket_name}/{directory}\")\n",
+ "\n",
+ " return list(directories)\n",
+ "\n",
+ "\n",
+ "def extract_metric_name(metric_string: str):\n",
+ " # Use a regular expression to find the metric name\n",
+ " match = re.search(r\"\\.(\\w+)/\", metric_string)\n",
+ " # Return the matched group if found\n",
+ " return match.group(1) if match else metric_string\n",
+ "\n",
+ "\n",
+ "def read_file_from_gcs(filename: str):\n",
+ " with gfile.GFile(filename, \"r\") as f:\n",
+ " return f.read()\n",
+ "\n",
+ "\n",
+ "def process_results(df: pd.DataFrame) -> pd.DataFrame:\n",
+ " \"\"\"Process the results removing columns that could be confusing.\"\"\"\n",
+ " columns_to_drop = []\n",
+ " # Dropping columns that could be confusing.\n",
+ " for col in df.columns:\n",
+ " if \"confidence\" in col:\n",
+ " columns_to_drop.append(col)\n",
+ " if \"raw_eval_resp\" in col:\n",
+ " columns_to_drop.append(col)\n",
+ " if col == \"instruction\":\n",
+ " columns_to_drop.append(col)\n",
+ " if col == \"context\":\n",
+ " columns_to_drop.append(col)\n",
+ " return df.drop(columns=columns_to_drop)\n",
+ "\n",
+ "\n",
+ "class ResultsUI:\n",
+ " \"\"\"A UI to display the results of a VAPO run.\"\"\"\n",
+ "\n",
+ " def __init__(self, path: str):\n",
+ " required_files = [\"eval_results.json\", \"templates.json\"]\n",
+ " runs = find_directories_with_files(path, required_files)\n",
+ "\n",
+ " self.run_label = widgets.Label(\"Select Run:\")\n",
+ " self.run_dropdrown = widgets.Dropdown(\n",
+ " options=runs, value=runs[0], layout=widgets.Layout(width=\"200px\")\n",
+ " )\n",
+ " self.run_dropdrown.observe(self.display_run_handler, names=\"value\")\n",
+ "\n",
+ " # Create a label widget for the description\n",
+ " self.dropdown_description = widgets.Label(\"Select Template:\")\n",
+ " self.template_dropdown = widgets.Dropdown(\n",
+ " options=[],\n",
+ " value=None,\n",
+ " layout=widgets.Layout(width=\"400px\"),\n",
+ " disabled=True,\n",
+ " )\n",
+ " self.template_dropdown.observe(self.display_template_handler, names=\"value\")\n",
+ " self.results_output = widgets.Output(\n",
+ " layout=widgets.Layout(\n",
+ " height=\"600px\", overflow=\"auto\", margin=\"20px 0px 0px 0px\"\n",
+ " )\n",
+ " )\n",
+ " self.display_run(runs[0])\n",
+ "\n",
+ " def display_template_handler(self, change: dict[str, str]) -> None:\n",
+ " \"\"\"Display the template and the corresponding evaluation results.\"\"\"\n",
+ " if change[\"new\"] is None:\n",
+ " return\n",
+ " df_index = int(change[\"new\"].split(\" \")[1])\n",
+ " self.display_eval_results(df_index)\n",
+ "\n",
+ " def display_run_handler(self, change) -> None:\n",
+ " if change[\"new\"] is None:\n",
+ " return\n",
+ "\n",
+ " path = change[\"new\"]\n",
+ " self.display_run(path)\n",
+ "\n",
+ " def display_run(self, path: str) -> None:\n",
+ " \"\"\"Display the results of a VAPO run.\"\"\"\n",
+ " self.run_dropdrown.disabled = True\n",
+ " filename = f\"{path}/eval_results.json\"\n",
+ " eval_results = json.loads(read_file_from_gcs(filename))\n",
+ "\n",
+ " filename = f\"{path}/templates.json\"\n",
+ " templates = json.loads(read_file_from_gcs(filename))\n",
+ "\n",
+ " if len(templates) == len(eval_results):\n",
+ " offset = 0\n",
+ " elif len(templates) == len(eval_results) + 1:\n",
+ " # In some setups it is possible to have 1 more template than results.\n",
+ " offset = 1\n",
+ " else:\n",
+ " raise ValueError(\n",
+ " \"Number of templates doesn't match number of eval results\"\n",
+ " f\" {len(templates)} vs {len(eval_results)}\"\n",
+ " )\n",
+ " self.templates = [\n",
+ " pd.json_normalize(template) for template in templates[offset:]\n",
+ " ]\n",
+ " metric_columns = [col for col in self.templates[0].columns if \"metric\" in col]\n",
+ "\n",
+ " self.eval_results = [\n",
+ " process_results(pd.read_json(StringIO(result[\"metrics_table\"])))\n",
+ " for result in eval_results\n",
+ " ]\n",
+ " options = []\n",
+ " for i, template in enumerate(self.templates):\n",
+ " metrics = []\n",
+ " for col in metric_columns:\n",
+ " value = template[col].tolist()[0]\n",
+ " short_col = extract_metric_name(col)\n",
+ " metrics.append(f\"{short_col}: {value}\")\n",
+ " metrics_str = \" \".join(metrics)\n",
+ " options.append(f\"Template {i} {metrics_str}\")\n",
+ "\n",
+ " self.template_dropdown.disabled = False\n",
+ " self.template_dropdown.options = options\n",
+ " self.run_dropdrown.disabled = False\n",
+ "\n",
+ " def display_eval_results(self, index: int) -> None:\n",
+ " \"\"\"Display the evaluation results for a specific template.\"\"\"\n",
+ " with self.results_output:\n",
+ " self.results_output.clear_output(wait=True) # Clear previous output\n",
+ " display_dataframe(self.templates[index])\n",
+ " print()\n",
+ " display_dataframe(self.eval_results[index])\n",
+ "\n",
+ " def get_container(self) -> widgets.Output:\n",
+ " \"\"\"Get the container widget for the results UI.\"\"\"\n",
+ " return widgets.VBox(\n",
+ " [\n",
+ " self.run_label,\n",
+ " self.run_dropdrown,\n",
+ " self.dropdown_description,\n",
+ " self.template_dropdown,\n",
+ " self.results_output,\n",
+ " ]\n",
+ " )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-p59jd5rOp4q"
+ },
+ "source": [
+ "# Step 1: Configure your prompt template\n",
+ "Prompts consist of two key parts:\n",
+ "* System Instruction (SI) Template: A fixed instruction shared across all queries for a given task.\n",
+ "* Task/Context Template: A dynamic part that changes based on the task.\n",
+ "\n",
+ "APD enables the translation and optimization of the System Instruction Template, while the Task/Context Template remains essential for evaluating different SI templates."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rJG1pVZO317x"
+ },
+ "outputs": [],
+ "source": [
+ "SYSTEM_INSTRUCTION = \"Answer the following question. Let's think step by step.\\n\" # @param {type:\"string\"}\n",
+ "PROMPT_TEMPLATE = (\n",
+ " \"Question: {{question}}\\n\\nAnswer:{{target}}\" # @param {type:\"string\"}\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5y-cmg0TQP6v"
+ },
+ "source": [
+ "# Step 2: Input your data\n",
+ "To optimize the model, provide a CSV or JSONL file containing labeled validation samples\n",
+ "* Focus on examples that specifically demonstrate the issues you want to address.\n",
+ "* Recommendation: Use 50-100 distinct samples for reliable results. However, the tool can still be effective with as few as 5 samples.\n",
+ "\n",
+ "For prompt translation:\n",
+ "* Consider using the source model to label examples that the target model struggles with, helping to identify areas for improvement.\n"
+ ]
+ },
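+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For instance, with the default template above, each JSONL line supplies the template variables (here `question`) plus the `target` label. An illustrative row (not shipped with the notebook):\n",
+ "\n",
+ "```json\n",
+ "{\"question\": \"What is the boiling point of water at sea level?\", \"target\": \"100 degrees Celsius\"}\n",
+ "```"
+ ]
+ },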
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "mfgi_oR6tTIB"
+ },
+ "outputs": [],
+ "source": [
+ "# @markdown **Project setup**:
\n",
+ "PROJECT_ID = \"[YOUR_PROJECT]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "OUTPUT_PATH = \"[OUTPUT_PATH]\" # @param {type:\"string\"}\n",
+ "# @markdown * GCS path of your bucket, e.g., gs://prompt_translation_demo, used to store all artifacts.\n",
+ "INPUT_DATA_PATH = \"[INPUT_DATA_PATH]\" # @param {type:\"string\"}\n",
+ "# @markdown * Specify a GCS path for the input data, e.g., gs://prompt_translation_demo/input_data.jsonl."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ucebZHkHRxKH"
+ },
+ "source": [
+ "# Step 3: Configure optimization settings\n",
+ "The optimization configs are defaulted to the values that are most commonly used and which we recommend using initially."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "B2R3P8mMvK9q"
+ },
+ "outputs": [],
+ "source": [
+ "TARGET_MODEL = \"gemini-1.5-flash-001\" # @param [\"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\"]\n",
+ "SOURCE_MODEL = \"\" # @param [\"\", \"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"]\n",
+ "# @markdown * If set, it will be used to generate ground truth responses for the input examples. This is useful to migrate the prompt from a source model.\n",
+ "OPTIMIZATION_MODE = \"instruction_and_demo\" # @param [\"instruction\", \"demonstration\", \"instruction_and_demo\"]\n",
+ "OPTIMIZATION_METRIC = \"question_answering_correctness\" # @param [\"bleu\", \"coherence\", \"exact_match\", \"fluency\", \"groundedness\", \"text_quality\", \"verbosity\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_quality\", \"summarization_quality\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\", \"tool_call_valid\"] {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kO7fO0qTSNLs"
+ },
+ "source": [
+ "# Step 4: Configure advanced optimization settings [Optional]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "fRHHTpaV4Xyo"
+ },
+ "outputs": [],
+ "source": [
+ "# @markdown **Instruction Optimization Configs**:
\n",
+ "NUM_INST_OPTIMIZATION_STEPS = 10 # @param {type:\"integer\"}\n",
+ "NUM_TEMPLATES_PER_STEP = 2 # @param {type:\"integer\"}\n",
+ "# @markdown * Number of prompt templates generated and evaluated at each optimization step.\n",
+ "\n",
+ "# @markdown **Demonstration Optimization Configs**:
\n",
+ "NUM_DEMO_OPTIMIZATION_STEPS = 10 # @param {type:\"integer\"}\n",
+ "NUM_DEMO_PER_PROMPT = 3 # @param {type:\"integer\"}\n",
+ "# @markdown * Number of the demonstrations to include in each prompt.\n",
+ "\n",
+ "# @markdown **Model Configs**:
\n",
+ "TARGET_MODEL_QPS = 3 # @param {type:\"integer\"}\n",
+ "SOURCE_MODEL_QPS = 3 # @param {type:\"integer\"}\n",
+ "OPTIMIZER_MODEL = \"gemini-1.5-flash-001\" # @param [\"gemini-1.0-pro-001\", \"gemini-1.0-pro-002\", \"gemini-1.5-flash-001\", \"gemini-1.5-pro-001\", \"gemini-1.0-ultra-001\", \"text-bison@001\", \"text-bison@002\", \"text-bison32k@002\", \"text-unicorn@001\"]\n",
+ "# @markdown * The model used to generated alternative prompts in the instruction optimization mode.\n",
+ "OPTIMIZER_MODEL_QPS = 3 # @param {type:\"integer\"}\n",
+ "EVAL_MODEL_QPS = 3 # @param {type:\"integer\"}\n",
+ "# @markdown * The QPS for calling the eval model, which is currently gemini-1.5-pro-001.\n",
+ "\n",
+ "# @markdown **Multi-metric Configs**:
\n",
+ "# @markdown Use this section only if you need more than one metric for optimization. This will override the metric you picked above.\n",
+ "OPTIMIZATION_METRIC_1 = \"NA\" # @param [\"NA\", \"bleu\", \"coherence\", \"exact_match\", \"fluency\", \"groundedness\", \"text_quality\", \"verbosity\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_quality\", \"summarization_quality\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\", \"tool_call_valid\"] {type:\"string\"}\n",
+ "OPTIMIZATION_METRIC_1_WEIGHT = 0.0 # @param {type:\"number\"}\n",
+ "OPTIMIZATION_METRIC_2 = \"NA\" # @param [\"NA\", \"bleu\", \"coherence\", \"exact_match\", \"fluency\", \"groundedness\", \"text_quality\", \"verbosity\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_quality\", \"summarization_quality\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\", \"tool_call_valid\"] {type:\"string\"}\n",
+ "OPTIMIZATION_METRIC_2_WEIGHT = 0.0 # @param {type:\"number\"}\n",
+ "OPTIMIZATION_METRIC_3 = \"NA\" # @param [\"NA\", \"bleu\", \"coherence\", \"exact_match\", \"fluency\", \"groundedness\", \"text_quality\", \"verbosity\", \"rouge_1\", \"rouge_2\", \"rouge_l\", \"rouge_l_sum\", \"safety\", \"question_answering_correctness\", \"question_answering_quality\", \"summarization_quality\", \"tool_name_match\", \"tool_parameter_key_match\", \"tool_parameter_kv_match\", \"tool_call_valid\"] {type:\"string\"}\n",
+ "OPTIMIZATION_METRIC_3_WEIGHT = 0.0 # @param {type:\"number\"}\n",
+ "METRIC_AGGREGATION_TYPE = \"weighted_sum\" # @param [\"weighted_sum\", \"weighted_average\"]\n",
+ "\n",
+ "# @markdown **Misc Configs**:
\n",
+ "PLACEHOLDER_TO_VALUE = \"{}\" # @param\n",
+ "# @markdown * This variable is used for long prompt optimization to not optimize parts of prompt identified by placeholders. It provides a mapping from the placeholder variables to their content. See link for details.\n",
+ "RESPONSE_MIME_TYPE = \"application/json\" # @param [\"text/plain\", \"application/json\"]\n",
+ "# @markdown * This variable determines the format of the output for the target model. See link for details.\n",
+ "TARGET_LANGUAGE = \"English\" # @param [\"English\", \"French\", \"German\", \"Hebrew\", \"Hindi\", \"Japanese\", \"Korean\", \"Portuguese\", \"Simplified Chinese\", \"Spanish\", \"Traditional Chinese\"]\n",
+ "# @markdown * The language of the system instruction."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "X7Mgb0EHSSFk"
+ },
+ "source": [
+ "# Step 5: Run Prompt Optimizer"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Z8NvNLTfxPTf"
+ },
+ "outputs": [],
+ "source": [
+ "timestamp = datetime.datetime.now().strftime(\"%Y-%m-%dT%H:%M:%S\")\n",
+ "display_name = f\"pt_{timestamp}\"\n",
+ "\n",
+ "in_colab_enterprise = \"GOOGLE_CLOUD_PROJECT\" in os.environ\n",
+ "if not in_colab_enterprise:\n",
+ " gc = authenticate()\n",
+ "\n",
+ "label_enforced = is_run_target_required(\n",
+ " [\n",
+ " OPTIMIZATION_METRIC,\n",
+ " OPTIMIZATION_METRIC_1,\n",
+ " OPTIMIZATION_METRIC_2,\n",
+ " OPTIMIZATION_METRIC_3,\n",
+ " ],\n",
+ " SOURCE_MODEL,\n",
+ ")\n",
+ "input_data_path = f\"{INPUT_DATA_PATH}\"\n",
+ "validate_prompt_and_data(\n",
+ " \"\\n\".join([SYSTEM_INSTRUCTION, PROMPT_TEMPLATE]),\n",
+ " input_data_path,\n",
+ " PLACEHOLDER_TO_VALUE,\n",
+ " label_enforced,\n",
+ ")\n",
+ "\n",
+ "output_path = f\"{OUTPUT_PATH}/{display_name}\"\n",
+ "\n",
+ "params = {\n",
+ " \"project\": PROJECT_ID,\n",
+ " \"num_steps\": NUM_INST_OPTIMIZATION_STEPS,\n",
+ " \"prompt_template\": SYSTEM_INSTRUCTION,\n",
+ " \"demo_and_query_template\": PROMPT_TEMPLATE,\n",
+ " \"target_model\": TARGET_MODEL,\n",
+ " \"target_model_qps\": TARGET_MODEL_QPS,\n",
+ " \"target_model_location\": LOCATION,\n",
+ " \"source_model\": SOURCE_MODEL,\n",
+ " \"source_model_qps\": SOURCE_MODEL_QPS,\n",
+ " \"source_model_location\": LOCATION,\n",
+ " \"eval_model_qps\": EVAL_MODEL_QPS,\n",
+ " \"eval_model_location\": LOCATION,\n",
+ " \"optimization_mode\": OPTIMIZATION_MODE,\n",
+ " \"num_demo_set_candidates\": NUM_DEMO_OPTIMIZATION_STEPS,\n",
+ " \"demo_set_size\": NUM_DEMO_PER_PROMPT,\n",
+ " \"aggregation_type\": METRIC_AGGREGATION_TYPE,\n",
+ " \"data_limit\": 50,\n",
+ " \"optimizer_model\": OPTIMIZER_MODEL,\n",
+ " \"optimizer_model_qps\": OPTIMIZER_MODEL_QPS,\n",
+ " \"optimizer_model_location\": LOCATION,\n",
+ " \"num_template_eval_per_step\": NUM_TEMPLATES_PER_STEP,\n",
+ " \"input_data_path\": input_data_path,\n",
+ " \"output_path\": output_path,\n",
+ " \"response_mime_type\": RESPONSE_MIME_TYPE,\n",
+ " \"language\": TARGET_LANGUAGE,\n",
+ " \"placeholder_to_content\": json.loads(PLACEHOLDER_TO_VALUE),\n",
+ "}\n",
+ "\n",
+ "if OPTIMIZATION_METRIC_1 == \"NA\":\n",
+ " params[\"eval_metrics_types\"] = [OPTIMIZATION_METRIC]\n",
+ " params[\"eval_metrics_weights\"] = [1.0]\n",
+ "else:\n",
+ " metrics = []\n",
+ " weights = []\n",
+ " for metric in [OPTIMIZATION_METRIC_1, OPTIMIZATION_METRIC_2, OPTIMIZATION_METRIC_3]:\n",
+ " if metric == \"NA\":\n",
+ " break\n",
+ " metrics.append(metric)\n",
+ " weights.append(OPTIMIZATION_METRIC_1_WEIGHT)\n",
+ " params[\"eval_metrics_types\"] = metrics\n",
+ " params[\"eval_metrics_weights\"] = weights\n",
+ "\n",
+ "job = run_apd(params, OUTPUT_PATH, display_name)\n",
+ "print(f\"Job ID: {job.name}\")\n",
+ "\n",
+ "progress_form = ProgressForm()\n",
+ "while progress_form.monitor_progress(job, params):\n",
+ " time.sleep(5)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "lo5mcTzwSgBP"
+ },
+ "source": [
+ "# Step 6: Inspect the Results\n",
+ "You can use the following cell to inspect all the predictions made by all the\n",
+ "generated templates during one or multiple VAPO runs."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1x6HSty759jY"
+ },
+ "outputs": [],
+ "source": [
+ "RESULT_PATH = \"[GCS_PATH]\" # @param {type:\"string\"}\n",
+ "# @markdown * Specify a GCS path that contains artifacts of a single or multiple VAPO runs.\n",
+ "\n",
+ "results_ui = ResultsUI(RESULT_PATH)\n",
+ "\n",
+ "results_df_html = \"\"\"\n",
+ "\n",
+ "\"\"\"\n",
+ "\n",
+ "display(HTML(results_df_html))\n",
+ "display(results_ui.get_container())"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "vertex_ai_prompt_optimizer_ui.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
From f0eef416916afe2f1d478c187bfafff580494727 Mon Sep 17 00:00:00 2001
From: nhootan <103317089+nhootan@users.noreply.github.com>
Date: Wed, 18 Sep 2024 14:31:44 -0400
Subject: [PATCH 08/10] fix: Adding the links table at the top of the VAPO
notebook. (#1134)
# Description
Adding the links table to the top of the VAPO notebook.
---------
Co-authored-by: hootan
Co-authored-by: Owl Bot
Co-authored-by: Holt Skinner
---
.../vertex_ai_prompt_optimizer_ui.ipynb | 30 +++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
index aacc919c5bd..ee1005f290c 100644
--- a/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
+++ b/gemini/prompts/prompt_optimizer/vertex_ai_prompt_optimizer_ui.ipynb
@@ -23,6 +23,36 @@
"# limitations under the License."
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RN8N3O43QDT5"
+ },
+ "source": [
+ "\n",
+ " \n",
+ " \n",
+ " Open in Colab\n",
+ " \n",
+ " | \n",
+ " \n",
+ " \n",
+ " Open in Colab Enterprise\n",
+ " \n",
+ " | \n",
+ " \n",
+ " \n",
+ " Open in Vertex AI Workbench\n",
+ " \n",
+ " | \n",
+ " \n",
+ " \n",
+ " View on GitHub\n",
+ " \n",
+ " | \n",
+ "
"
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {
From 9de33214c15bc7b6a4d084786bd57af90b12f0ad Mon Sep 17 00:00:00 2001
From: Kristopher Overholt
Date: Thu, 19 Sep 2024 11:02:48 -0500
Subject: [PATCH 09/10] feat: Improve error catching in the SQL Talk app
(Gemini Function Calling) (#1136)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
# Description
This PR adds improved error catching to the SQL Talk app. Currently, if
an error is encountered due to a malformed generated SQL query or an
error executing the tools / functions, then a full stack trace will
appear in the app. With this PR, errors are caught at the SQL execution
level and top-level application and rendered in the app (and persisted
in the message history) without a full stack trace.
Special thanks to @mona19 for suggesting this change and providing a
sample implementation! 🙏
How friendlier errors will appear now instead of a full stack trace:
---
![Screenshot 2024-09-18 at 6 21
37 PM](https://github.com/user-attachments/assets/44c45e96-095e-4373-ab23-a6fde871b6e0)
---
![Screenshot 2024-09-18 at 6 22
11 PM](https://github.com/user-attachments/assets/b47cba62-6ee2-4ddf-8832-9f4e4643d0df)
---------
Co-authored-by: Owl Bot
---
gemini/function-calling/sql-talk-app/app.py | 260 +++++++++++---------
1 file changed, 148 insertions(+), 112 deletions(-)
diff --git a/gemini/function-calling/sql-talk-app/app.py b/gemini/function-calling/sql-talk-app/app.py
index c7ecda501b0..4661a27b0ea 100644
--- a/gemini/function-calling/sql-talk-app/app.py
+++ b/gemini/function-calling/sql-talk-app/app.py
@@ -140,131 +140,167 @@
from BigQuery, do not make up information.
"""
- response = chat.send_message(prompt)
- response = response.candidates[0].content.parts[0]
-
- print(response)
+ try:
+ response = chat.send_message(prompt)
+ response = response.candidates[0].content.parts[0]
- api_requests_and_responses = []
- backend_details = ""
+ print(response)
- function_calling_in_process = True
- while function_calling_in_process:
- try:
- params = {}
- for key, value in response.function_call.args.items():
- params[key] = value
+ api_requests_and_responses = []
+ backend_details = ""
- print(response.function_call.name)
- print(params)
+ function_calling_in_process = True
+ while function_calling_in_process:
+ try:
+ params = {}
+ for key, value in response.function_call.args.items():
+ params[key] = value
- if response.function_call.name == "list_datasets":
- api_response = client.list_datasets()
- api_response = BIGQUERY_DATASET_ID
- api_requests_and_responses.append(
- [response.function_call.name, params, api_response]
- )
-
- if response.function_call.name == "list_tables":
- api_response = client.list_tables(params["dataset_id"])
- api_response = str([table.table_id for table in api_response])
- api_requests_and_responses.append(
- [response.function_call.name, params, api_response]
- )
+ print(response.function_call.name)
+ print(params)
- if response.function_call.name == "get_table":
- api_response = client.get_table(params["table_id"])
- api_response = api_response.to_api_repr()
- api_requests_and_responses.append(
- [
- response.function_call.name,
- params,
- [
- str(api_response.get("description", "")),
- str(
- [
- column["name"]
- for column in api_response["schema"]["fields"]
- ]
- ),
- ],
- ]
- )
- api_response = str(api_response)
-
- if response.function_call.name == "sql_query":
- job_config = bigquery.QueryJobConfig(
- maximum_bytes_billed=100000000
- ) # Data limit per query job
- try:
- cleaned_query = (
- params["query"]
- .replace("\\n", " ")
- .replace("\n", "")
- .replace("\\", "")
- )
- query_job = client.query(cleaned_query, job_config=job_config)
- api_response = query_job.result()
- api_response = str([dict(row) for row in api_response])
- api_response = api_response.replace("\\", "").replace("\n", "")
+ if response.function_call.name == "list_datasets":
+ api_response = client.list_datasets()
+ api_response = BIGQUERY_DATASET_ID
api_requests_and_responses.append(
[response.function_call.name, params, api_response]
)
- except Exception as e:
- api_response = f"{str(e)}"
+
+ if response.function_call.name == "list_tables":
+ api_response = client.list_tables(params["dataset_id"])
+ api_response = str([table.table_id for table in api_response])
api_requests_and_responses.append(
[response.function_call.name, params, api_response]
)
- print(api_response)
-
- response = chat.send_message(
- Part.from_function_response(
- name=response.function_call.name,
- response={
- "content": api_response,
- },
- ),
- )
- response = response.candidates[0].content.parts[0]
-
- backend_details += "- Function call:\n"
- backend_details += (
- " - Function name: ```"
- + str(api_requests_and_responses[-1][0])
- + "```"
- )
- backend_details += "\n\n"
- backend_details += (
- " - Function parameters: ```"
- + str(api_requests_and_responses[-1][1])
- + "```"
- )
- backend_details += "\n\n"
- backend_details += (
- " - API response: ```"
- + str(api_requests_and_responses[-1][2])
- + "```"
- )
- backend_details += "\n\n"
- with message_placeholder.container():
- st.markdown(backend_details)
+ if response.function_call.name == "get_table":
+ api_response = client.get_table(params["table_id"])
+ api_response = api_response.to_api_repr()
+ api_requests_and_responses.append(
+ [
+ response.function_call.name,
+ params,
+ [
+ str(api_response.get("description", "")),
+ str(
+ [
+ column["name"]
+ for column in api_response["schema"][
+ "fields"
+ ]
+ ]
+ ),
+ ],
+ ]
+ )
+ api_response = str(api_response)
+
+ if response.function_call.name == "sql_query":
+ job_config = bigquery.QueryJobConfig(
+ maximum_bytes_billed=100000000
+ ) # Data limit per query job
+ try:
+ cleaned_query = (
+ params["query"]
+ .replace("\\n", " ")
+ .replace("\n", "")
+ .replace("\\", "")
+ )
+ query_job = client.query(
+ cleaned_query, job_config=job_config
+ )
+ api_response = query_job.result()
+ api_response = str([dict(row) for row in api_response])
+ api_response = api_response.replace("\\", "").replace(
+ "\n", ""
+ )
+ api_requests_and_responses.append(
+ [response.function_call.name, params, api_response]
+ )
+ except Exception as e:
+ error_message = f"""
+ We're having trouble running this SQL query. This
+ could be due to an invalid query or the structure of
+ the data. Try rephrasing your question to help the
+ model generate a valid query. Details:
+
+ {str(e)}"""
+ st.error(error_message)
+ api_response = error_message
+ api_requests_and_responses.append(
+ [response.function_call.name, params, api_response]
+ )
+ st.session_state.messages.append(
+ {
+ "role": "assistant",
+ "content": error_message,
+ }
+ )
+
+ print(api_response)
+
+ response = chat.send_message(
+ Part.from_function_response(
+ name=response.function_call.name,
+ response={
+ "content": api_response,
+ },
+ ),
+ )
+ response = response.candidates[0].content.parts[0]
+
+ backend_details += "- Function call:\n"
+ backend_details += (
+ " - Function name: ```"
+ + str(api_requests_and_responses[-1][0])
+ + "```"
+ )
+ backend_details += "\n\n"
+ backend_details += (
+ " - Function parameters: ```"
+ + str(api_requests_and_responses[-1][1])
+ + "```"
+ )
+ backend_details += "\n\n"
+ backend_details += (
+ " - API response: ```"
+ + str(api_requests_and_responses[-1][2])
+ + "```"
+ )
+ backend_details += "\n\n"
+ with message_placeholder.container():
+ st.markdown(backend_details)
- except AttributeError:
- function_calling_in_process = False
+ except AttributeError:
+ function_calling_in_process = False
- time.sleep(3)
+ time.sleep(3)
- full_response = response.text
- with message_placeholder.container():
- st.markdown(full_response.replace("$", r"\$")) # noqa: W605
- with st.expander("Function calls, parameters, and responses:"):
- st.markdown(backend_details)
+ full_response = response.text
+ with message_placeholder.container():
+ st.markdown(full_response.replace("$", r"\$")) # noqa: W605
+ with st.expander("Function calls, parameters, and responses:"):
+ st.markdown(backend_details)
- st.session_state.messages.append(
- {
- "role": "assistant",
- "content": full_response,
- "backend_details": backend_details,
- }
- )
+ st.session_state.messages.append(
+ {
+ "role": "assistant",
+ "content": full_response,
+ "backend_details": backend_details,
+ }
+ )
+ except Exception as e:
+ print(e)
+ error_message = f"""
+ Something went wrong! We encountered an unexpected error while
+ trying to process your request. Please try rephrasing your
+ question. Details:
+
+ {str(e)}"""
+ st.error(error_message)
+ st.session_state.messages.append(
+ {
+ "role": "assistant",
+ "content": error_message,
+ }
+ )
From f67f1afd296f8116e0b3de459a9e324b3b7c9965 Mon Sep 17 00:00:00 2001
From: Mend Renovate
Date: Thu, 19 Sep 2024 20:15:39 +0200
Subject: [PATCH 10/10] chore(deps): update dependency faker to v29 (#1140)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This PR contains the following updates:
| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [faker](https://redirect.github.com/joke2k/faker)
([changelog](https://redirect.github.com/joke2k/faker/blob/master/CHANGELOG.md))
| `26.0.0` -> `29.0.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/faker/29.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/faker/29.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/faker/26.0.0/29.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/faker/26.0.0/29.0.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
---
> [!WARNING]
> Some dependencies could not be looked up. Check the warning logs for
more information.
---
### Release Notes
joke2k/faker (faker)
###
[`v29.0.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2900---2024-09-19)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.4.1...v29.0.0)
- Fix `pydecimal` distribution when called with a range across `0`.
Thanks [@AlexLitvino](https://redirect.github.com/AlexLitvino).
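For context, the fixed case is a call whose requested range straddles zero, e.g. (a quick sketch using Faker's standard `pydecimal` provider; the seed is only for reproducibility):

```python
from faker import Faker

fake = Faker()
Faker.seed(0)  # deterministic output for reproducibility

# The requested range crosses 0; v29.0.0 fixes how results are distributed here.
value = fake.pydecimal(min_value=-10, max_value=10)
print(value)
```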
###
[`v28.4.1`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2841---2024-09-04)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.4.0...v28.4.1)
- Fix issue where Faker does not properly convert min/max float values
to `Decimal`. Thanks
[@bdjellabaldebaran](https://redirect.github.com/bdjellabaldebaran).
###
[`v28.4.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2840---2024-09-04)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.3.0...v28.4.0)
- Add `it_IT` lorem provider. Thanks
[@gianni-di-noia](https://redirect.github.com/gianni-di-noia).
###
[`v28.3.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2830---2024-09-04)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.2.0...v28.3.0)
- Fix male forms of female surnames in `uk_UA`. Thanks
[@AlexLitvino](https://redirect.github.com/AlexLitvino).
###
[`v28.2.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2820---2024-09-04)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.1.0...v28.2.0)
- Add `es_ES` isbn provider. Thanks
[@mondeja](https://redirect.github.com/mondeja).
###
[`v28.1.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2810---2024-08-30)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v28.0.0...v28.1.0)
- Fix Incorrect City Spelling in `uk_UA` locale. Thanks
[@ch4zzy](https://redirect.github.com/ch4zzy).
###
[`v28.0.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2800---2024-08-23)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v27.4.0...v28.0.0)
- Fix `pydecimal` handling of `positive` keyword. Thanks
[@tahzeer](https://redirect.github.com/tahzeer).
###
[`v27.4.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2740---2024-08-21)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v27.3.0...v27.4.0)
- Add person provider for `pk_PK` locale. Thanks
[@c2-tlhah](https://redirect.github.com/c2-tlhah)
###
[`v27.3.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2730---2024-08-21)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v27.2.0...v27.3.0)
- Add providers for `vi_VN` locale. Thanks
[@ntd1683](https://redirect.github.com/ntd1683).
###
[`v27.2.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2720---2024-08-21)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v27.1.0...v27.2.0)
- Split names in `en_IN` person provider. Thanks
[@wh0th3h3llam1](https://redirect.github.com/wh0th3h3llam1).
###
[`v27.1.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2710---2024-08-21)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v27.0.0...v27.1.0)
- Add address provider for `en_MS` locale. Thanks
[@carlosfunk](https://redirect.github.com/carlosfunk).
###
[`v27.0.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2700---2024-08-12)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v26.3.0...v27.0.0)
- Re-introduce `part_of_speech` argument to `words()` method.
###
[`v26.3.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2630---2024-08-08)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v26.2.0...v26.3.0)
- Extend `ro_RO` company localization with prefixes. Thanks
[@DDSNA](https://redirect.github.com/DDSNA).
###
[`v26.2.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2620---2024-08-06)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v26.1.0...v26.2.0)
- Add Swahili (`sw`) provider for generating Swahili names. Thanks
[@5uru](https://redirect.github.com/5uru).
###
[`v26.1.0`](https://redirect.github.com/joke2k/faker/blob/HEAD/CHANGELOG.md#v2610---2024-08-01)
[Compare
Source](https://redirect.github.com/joke2k/faker/compare/v26.0.0...v26.1.0)
- Add more entries to `sk_SK` Geo provider. Thanks
[@george0st](https://redirect.github.com/george0st).
---
### Configuration
📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/GoogleCloudPlatform/generative-ai).
---
gemini/sample-apps/llamaindex-rag/pyproject.toml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gemini/sample-apps/llamaindex-rag/pyproject.toml b/gemini/sample-apps/llamaindex-rag/pyproject.toml
index 317ee5d197f..3869011ab7d 100644
--- a/gemini/sample-apps/llamaindex-rag/pyproject.toml
+++ b/gemini/sample-apps/llamaindex-rag/pyproject.toml
@@ -59,7 +59,7 @@ dulwich = "0.21.7"
email-validator = "2.2.0"
entrypoints = "0.4"
exceptiongroup = "1.2.2"
-faker = "26.0.0"
+faker = "29.0.0"
fastapi = "0.111.1"
fastapi-cli = "0.0.4"
fastjsonschema = "2.20.0"