Add vertex evaluator information to docs (#1) · firebase/genkit@ca23579

Add vertex evaluator information to docs (#1)

* Fix model routing (#161)

* [UI] Add new span tree + viewer to Flow details page (#164)

* Fetch models from API (#174)

* Backend errors (#163)

Display errors in the Prompt Playground component after receiving issues from backend

* [UI] Cleanup unimplemented pages from navbar (#180)

* [UI] Increase max-height of flow input/output (#179)

Also update styles for running + error statuses in output box.

* Move flow runner to Actions page (#176)

* [UI] Fix overflow of execution span tree (#183)

* Input validation disables prompt run button (#182)

Input validation for prompt playground

* Route playground from flows to action runners page (#191)

* Switch temperature to the slider (#195)

* Show validation errors on the playground (#196)

* [UI] Revamp flow details page layout (#197)

* Fix validator issues (#194)

* [UI] Initial design of span details view (#199)

* Move flow runner to start from action-list instead of action-runner (#200)

* Add vertex-ai to the model playground (#201)

Also add icons for all known action types

* [UI] Hide input/output pre if none available (#204)

* [UI] Add "muted" helper class for secondary text (#206)

* Don't send blank stop sequences to the model, vertex gemini model doesn't like it (#217)

* Provider specific model param restrictions on input (#224)

* Use the minified version of Monaco Editor in the angular app (#242)

* [UI] Update app name

* [UI] Update flow details layout (#246)

Also adds new `<expand-text>` shared component which adds a button to show text in a larger pop-up dialog.

* [UI] Add callout component (#244)

* [UI] Hide wrapper spans on details page (#254)

* [UI] Update flow durations on details page (#256)

* [UI] Show error on flow details page (#258)

* Playground load trace (#262)

* Code cleanup
* Load playground from a trace

* Add theme toggling for JSON editor and move schema to a tab next to the editor (#245)

* Give topP the slider treatment (#264)

It's only right, now that we've done temp. :-)

* [UI] Show flow name in tree (#266)

* [UI] Show span state in details pane (#268)

* [UI] Flows table style improvements (#269)

* [UI] Small flow details page improvements for narrow screens (#273)

* Add CustomOptions (#276)

Also, add stop sequences to the request.

* [UI]Remove sample calls for unsupported actions. Small fixes in flow runner. (#275)

* Create Message sub component for ModelPlayground (#271)

#148

* Fix error with model not accepting request_format (#279)

* Disable the minimap on the monaco editor (#286)

* [UI] Add zero state for flows list page (#291)

* [UI] Fix ng error in flow runner (#297)

* [UI] Hide stream response checkbox for durable flows (#299)

* Integrating the Message component into the Prompt Playground

* Switch model select from native to mat-select (#306)

* Ability to show errors on actions page (#307)

* [UI] Revamp Actions list UI (#308)

* [UI] Remove unnecessary return (#309)

* [UI] Prevent selecting action if no param is set (#310)

* Enable support for multiple messages coming from traceId (#314)

* Avoid making flow runner editors read only (#321)

* [UI] Add filtering and expand/collapse all to actions list (#319)

* Fix error where model selection does not update (#323)

* [UI] Fix action search input style (#325)

* [UI] Update action list name and key display (#328)

* User error callout component on model playground (#330)

* refactor the code around checking for json output support (#304)

* Render images in chat (#340)

* Functioning add and remove button (#335)

* Refactor criteria/validation logic out of playground component (#339)

* [UI] Flow runner UI polish + improvements (#343)

* Move JSON editor to shared components since retriever playground also needs it (#344)

* [UI] Small handful of UI nit fixes (#345)

* [UI] Add loading state to flows table (#349)

* Do not load output from trace; typically we're interested in loading up the inputs, and re-running to get the output (#347)

* Make response_format optional (#350)

* [UI] Add Genkit icon (#371)

* Reset streamed chunks when rerunning the streamed flow (#379)

* [UI] Add tooltips to span state icons (#351)

* Prefer includes over contains (#376)

Contains causes a `TypeError: _i.contains is not a function` when running evals.

* [UI] Add inspect flow state button if flow errors (#382)

* Chat mode (#391)

* Ability to open Flow runner from the trace view (#394)

* Add basics of the eval runner page (#367)

* initial ui changes

* formatted

* Add mocked evals page

* Unnest runs

* Remove evaluations tab from appbar

* [UI] Fix flow details sidebar colors in dark mode (#399)

* [UI] Revamp model playground to chat-based layout (#397)

* [UI] Flow runner: Add a callout for no output so we dont show empty response boxes (#403)

* [UI] Add trace details view (#405)

* role:system message allowed for models (#402)

* Adds support for image models. (#426)

* fix playground runner after runAction change (#429)

* Revert "fix playground runner after runAction change (#429)" (#431)

This reverts commit 82264c0777dd47b0835dda01362a902298ec044b.

* Small tweaks to model playground to reduce chat input clutter (#438)

* [UI] Update `stackTraceSpans` to filter out internal spans (#439)

* [UI] Add traces table to inspect index page (#448)

* Adding traces to Messages (#432)

* [UI] Update routing for inspect pages (#449)

* [UI] Update routing for run pages (#450)

* [UI] Fix trace display name in table (#451)

* Allow size to be optional (#452)

Model returns error otherwise: 400 None is not of type 'string' - 'size'

* [UI] Fix trace deep links in model playground (#453)

* [UI] Add raw mat-table for evals view (#430)

* initial ui changes

* formatted

* Add mocked evals page

* Add mocked table prelim

* tests

* Use EvalResult for now

* feedback changes

* Add embeddings models (#303)

* [UI] Update /evaluations route to /evaluate (#454)

Matches other verb-based top-level routes.

* [UI] Make all run buttons consistent in playgrounds (#455)

* [UI] Add cmd/ctrl + enter shortcut to playground editors (#456)

* [UI] Add landing state for Run page (#465)

* [UI] Prevent mat-slider from shrinking (#473)

* [UI] Adjust element widths for narrow browsers (#474)

* [UI] Prevent welcome page flicker on action refresh (#475)

* Add tab for Auth input to Flow Runner action (#467)

* [UI] Add JSON sample to flow runner (#479)

* Generic action runner (#484)

* [UI] Add support for tool primitive on dev UI run page (#488)

* [UI] Tighten up spacing of actions list items (#489)

* [UI] Trigger change detection on flow runner response (#486)

* [UI] Add cmd/ctrl + enter shortcut to model playground (#485)

* [UI] Update eval results UI to use expandable cards for results (#491)

* [UI] Prevent scrolling past last line in monaco editor (#495)

* [UI] Use helper class to style pre stacktrace in callout (#502)

* [UI]Evals UI: Update inputs to use a table format (#496)

* [UI] Model playground message styling polish (#515)

* [UI] Fix json editor to ignore initial value if no schema (#517)

* [UI] Set retriever name in playground header (#518)

* [UI] Prevent JSON sample pre-fill if unnecessary (#520)

* Remove fdescribe in tests (#532)

* Fix minor UI elements in eval page (#533)

* WIP Eval UI changes

* Clean scss

* simplify name getter

* trigger checks again

* undo

* Add inspect trace option (#540)

* WIP Eval UI changes

* Clean scss

* WIP add inspect button

* Add inspect button

* Add inspect button

* remove target

* Use links instead of button

* remove unused dep

* Add inspect tab in the Dev UI (#546)

* WIP Eval UI changes

* Clean scss

* WIP add inspect button

* Add inspect button

* Add inspect button

* remove target

* Use links instead of button

* remove unused dep

* Add evaluation tab

* Update messaging

* hide inspect button if no traces (#548)

* [UI] Add typewriter effect to welcome message (#554)

- Also include missing Google Sans fonts

* [UI] Tweak logo kerning (#555)

* [UI] UI polish for evaluate page (#553)

* [UI] Fix issue in action runner JSON pre-fill (#559)

* [UI] Update typewriter animation to move left-to-right (#560)

* [UI] Show custom metadata attributes last in span details (#563)

- Also move span duration logic to shared util function and show seconds if > 1000ms.

* [UI] Polish for eval result details pane (#564)

* Add support for text-embeddings (#538)

* [UI] Update default font to Google Sans (#565)

* [UI] Update span attributes styling (#568)

* [UI] Update border radius globally (#573)

* [UI] Clip model playground message loading bar to card radius (#576)

* [UI] Prevent shrinkage of breadcrumb chevron (#577)

* [UI] Upgrade angular deps to ^17.3.1 (#587)

* [UI] Add logo lockup to app bar (#588)

* [UI] Fix table not rendering for errored traces (#607)

* [UI] Render base64-encoded images in span output (#606)

* [UI] Update label of expand text button (#608)

* [UI] Update lockup with new svg asset (#623)

* [Eval bugbash] Update tooltip to definitions, visible on entire chip (#624)

* Update tooltip to definitions, visible on entire chip

* typos

* [Eval bugbash]  Show errors as errors in eval UI (#626)

* Update tooltip to definitions, visible on entire chip

* typos

* Mark errors as errors

* use ngIf

* Add TODO

* [Eval bugbash] Only show icon if failed evaluator (#635)

* Update tooltip to definitions, visible on entire chip

* typos

* WIP icons

* Remove unused

* [UI] Fix trace timing display now that they are millis (#638)

* [UI] Fix JSON editor to show up for optional inputs as well (#613)

* Add trace id to model playground when error occurs (#631)

* Display context strings separately instead of a big array (#658)

* [UI]: Update date format to medium (#659)

* Update error tooltip (#665)

* Update error tooltip

* typos

* Show error message if available

* [UI] Tighten up kerning on mat tab labels (#680)

* [UI] Allow resizing of .pre-container and json editor (#682)

* [UI] Add tooltips to temperature and top_p controls (#683)

* [UI] Fix JSON sample autofill in retriever playground (#684)

* [UI] Improve model playground param labels and add tooltips (#686)

* [UI] Fix trace status in table (#687)

* [UI] Update model icon to sparks (#688)

* [UI] Add action type to runner page title (#690)

* [UI] Add title and close button to expand text dialog (#691)

* [UI] Remove redundant title from action runner (#692)

* Pass thru options to API (#695)

* Bump ragas to 0.0.6 (#719)

* [UI] Cleanup system prompt styling in model playground (#725)

* Update system/message placeholders (#727)

* Update placeholders

* Update message.component.ts

* Update Eval Error handling (#685)

* Clarifying label on button formerly known as "Open in Playground" (#636)

- Label now says 'Open in flow runner', 'Open in model runner', etc.
  to make it more clear which step will be run.
- Changing to secondary style button to make it look less like
  the action will be run immediately.

* [UI] Fix callout content not stretching to fit width (#757)

* [UI]: Add metrics table in evals results card (#747)

* [UI] Add support for specifying model version in playground (#760)

* [UI] Remove Evaluate tab in top nav bar (#765)

* [UI] Use flask icon for Evaluate tab (#772)

* [UI] Style updates to eval result details (#790)

* [UI] Render eval metric name in error callout consistently (#792)

* [UI] Fix span duration display (#797)

* Show safety errors in the model runner (#800)

* Rename model playground => runner (#803)

* Rename retriever playground => runner (#805)

* [UI] Adjust metrics table to be full-width (#810)

* [UI] Only show eval zero state when loaded (#811)

Prevents a quick distracting flash of the zero state when the page loads.

* [UI] Set All traces as default in Inspect view (#812)

* [UI] ThemeToggleService unit tests (#816)

* [UI] Make spans deep-linkable in trace + flow details views (#819)

* [UI] Update model runner title to use selected model in config (#822)

* [UI] Clear out images from data-rendered upon receiving new input (#840)

* [UI] Hide append mode for models that do not support multiturn (#847)

* [UI] Show banner for unsupported models (#848)

* [UI] Reset scroll position of input/output when switching spans (#852)

* [UI] Hide "Add message" if model does not support multiturn (#853)

* Fix missed version 0.5.0-rc.1 (#858)

* [UI] Fix display of system prompt (#860)

* [UI] Fix tools icon (#862)

* [UI] Prevent stuck browser back when redirecting to first evaluation run (#13)

* [UI] Add missing app text color style (#16)

* [UI] Apply theme to scrollbars (#20)

* [UI] Clarify ID in flows/traces tables (#23)

* [UI] Show flow error in trace details view, if applicable (#28)

* [UI] Fix eval zero state callout spacing (#24)

* Export textEmbedding (#36)

* [UI] Update README doc with up-to-date instructions (#50)

* [UI] Create skeleton prompt runner component (#54)

Will serve as a base for prompt-specific runner features that we will add.

* [UI] Add icon to all view trace buttons (#57)

* [UI] Show template in prompt runner next to input (#58)

* [UI] Use button toggle group for inspect table filter (#56)

* [UI] Update play icon for run/dispatch span states (#60)

* More sensible default model params (#65)

* Always clear message when not in chat mode - otherwise if an error is shown, we'll still see the previous message. (#67)

* [UI] Show raw prompt template in modal (#70)

* Nesting user input in prompt runner (#72)

* [UI] Add support for prompt variants (#74)

* Allow system role for Gemini 1.5 Pro (#85)

Also removes references to OpenAI from UI.

* Create modular component for a multi-modal message (#83)

* Update faithfulness to v0.1.7 (#87)

* Update faithfulness to v0.1.7

* Update METADATA

* [UI] Add prompt variant to query params to support deep-linking (#88)

* [UI] Fix race condition when setting content in monaco (#96)

* [UI] Small visual fix in app nav bar (#98)

* [UI] Fix incorrect height for modal runner header (#101)

* [UI] Update placeholder label for model version select (#100)

* Message list component (#84)

Co-authored-by: Chris Chestnut <[email protected]>
Co-authored-by: Michael Doyle <[email protected]>

* [UI] Fix view evaluation report button to read correct metadata (#119)

* [UI] Save action sidebar expansion state to `localStorage` (#120)

* [UI]: Move model config params to a separate component (#103)

* [UI] Update model runner to use the new model config component (#124)

* [UI] Pull the new defaults for model config into the new config component (#125)

* [UI] Add ability to export prompt file from model runner (#115)

* [UI] Fix model versions not being loaded on initial render (#131)

Fixes google/genkit#130. This is more of a stop-gap fix, going to explore refactoring these components to utilize Angular signals to eliminate this class of error entirely.

* Integrate the new MessageList component into the ModelRunner (#114)

* [UI] Refactor model-config to use signals (#133)

* Create placeholder for system prompt and first user message (#144)

* [UI] Remove oops from model config template (#143)

* Ensure selected model is set when using left nav (#148)

* [UI] Prevent button icons from flex-shrinking (#151)

* Show large multimedia in a modal (#156)

* Enable all image types in model runner (#160)

* Re-enable gemini vision models (#168)

* [UI] Remove system prompt for single-turn models (#169)

* Set a reasonable (but arbitrary) number of media files per message (#172)

* [UI] Remove obsolete MONACO_PATH provider (unused) (#182)

* [UI] Sort eval metrics for consistent/comparable viewing (#209)

Fixes #207.

* change action latency name (#200)

Change the name of the action latency histogram from
"genkit.action.action_latency" to "genkit.action.latency"
to avoid stutter.

* Add vertex evaluator information to docs

* Address PR comments

Co-authored-by: Kevin Cheung <[email protected]>

---------

Co-authored-by: Michael Doyle <[email protected]>
Co-authored-by: Anthony Barone <[email protected]>
Co-authored-by: MaesterChestnut <[email protected]>
Co-authored-by: shrutip90 <[email protected]>
Co-authored-by: Pavel Jbanov <[email protected]>
Co-authored-by: Anthony Barone <[email protected]>
Co-authored-by: huangjeff5 <[email protected]>
Co-authored-by: ssbushi <[email protected]>
Co-authored-by: Michael Bleigh <[email protected]>
Co-authored-by: Max Lord <[email protected]>
Co-authored-by: Michael Doyle <[email protected]>
Co-authored-by: Chris Chestnut <[email protected]>
Co-authored-by: Jonathan Amsterdam <[email protected]>
Co-authored-by: Kevin Cheung <[email protected]>

15 people authored May 2, 2024

1 parent 9a9c1f3 commit ca23579

docs/plugins/vertex-ai.md

@@ Expand Up @@
     - Imagen2 image generation
     - Gecko text embedding generation
+    It also provides access to a subset of evaluation metrics through the Vertex AI [Rapid Evaluation API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/evaluation).
+    - [Safety](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#safetyinput)
+    - [Groundedness](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#groundednessinput)
+    - [ROUGE](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#rougeinput)
+    - [BLEU](https://cloud.google.com/vertex-ai/docs/reference/rest/v1beta1/projects.locations/evaluateInstances#bleuinput)
     ## Installation
     ```posix-terminal
@@ -66,6 +73,8 @@ credentials.
     ## Usage
+    ### Generative AI Models
     This plugin statically exports references to its supported generative AI models:
     ```js
@@ -115,7 +124,7 @@ const embedding = await embed({
     });
     ```
-    ### Anthropic Claude 3 on Vertex AI Model Garden
+    #### Anthropic Claude 3 on Vertex AI Model Garden
     If you have access to Claude 3 models ([haiku](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku), [sonnet](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-sonnet) or [opus](https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-opus)) in Vertex AI Model Garden you can use them with Genkit.
@@ -147,3 +156,36 @@ const llmResponse = await generate({
       prompt: 'What should I do when I visit Melbourne?',
     });
     ```
+    ### Evaluators
+    To use the evaluators from Vertex AI Rapid Evaluation, add an `evaluation` block to your `vertexAI` plugin configuration.
+    ```js
+    import { vertexAI, VertexAIEvaluationMetricType } from '@genkit-ai/vertexai';
+    export default configureGenkit({
+      plugins: [
+        vertexAI({
+          projectId: 'your-cloud-project',
+          location: 'us-central1',
+          evaluation: {
+            metrics: [
+              VertexAIEvaluationMetricType.SAFETY,
+              {
+                type: VertexAIEvaluationMetricType.ROUGE,
+                metricSpec: {
+                  rougeType: 'rougeLsum',
+                },
+              },
+            ],
+          },
+        }),
+      ],
+      // ...
+    });
+    ```
+    The configuration above adds evaluators for the `Safety` and `ROUGE` metrics. The example shows two approaches: the `Safety` metric uses the default specification, whereas the `ROUGE` metric provides a customized specification that sets the rouge type to `rougeLsum`.
+    Both evaluators can be run using the `genkit eval:run` command with a compatible dataset: that is, a dataset with `output` and `reference` fields. The `Safety` evaluator can also be run using the `genkit eval:flow -e vertexai/safety` command since it only requires an `output`.
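
As a rough illustration of the dataset requirement described in the added docs text, the sketch below checks which rows of a dataset each evaluator could score. The `requiredFields` map, the helper names, and the sample rows are hypothetical and not part of Genkit. The assumption that ROUGE needs a `reference` while Safety needs only an `output` is read from the paragraph above, not from the plugin source.

```js
// Hypothetical sample rows; only `output` and `reference` are named in the docs text.
const dataset = [
  {
    input: 'What is the capital of France?',
    output: 'Paris is the capital of France.',
    reference: 'Paris',
  },
  // This row has no `reference`, so only an output-only evaluator can score it.
  { input: 'Name a primary color.', output: 'Blue.' },
];

// Assumed field requirements per evaluator, inferred from the docs paragraph.
const requiredFields = {
  safety: ['output'],
  rouge: ['output', 'reference'],
};

// Returns true when the row has a non-empty string for every required field.
function isCompatible(row, evaluator) {
  return requiredFields[evaluator].every(
    (field) => typeof row[field] === 'string' && row[field].length > 0
  );
}

// Filters a dataset down to the rows a given evaluator could score.
function compatibleRows(rows, evaluator) {
  return rows.filter((row) => isCompatible(row, evaluator));
}

console.log(compatibleRows(dataset, 'safety').length); // 2
console.log(compatibleRows(dataset, 'rouge').length); // 1
```

A pre-check like this could be run before `genkit eval:run` to catch rows that a reference-based metric would not be able to score.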

js/plugins/vertexai/src/evaluation.ts

@@ -25,6 +25,7 @@ import { EvaluatorFactory } from './evaluator_factory';
      * https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/evaluation#parameter-list
      */
     export enum VertexAIEvaluationMetricType {
+      // Update genkit/docs/plugins/vertex-ai.md when modifying the list of enums
       SAFETY = 'SAFETY',
       GROUNDEDNESS = 'GROUNDEDNESS',
       BLEU = 'BLEU',
@@ Expand Down @@
