Releases: GoogleCloudPlatform/dfcx-scrapi
v1.13.1
v1.13.0
Breaking Changes
Note that 2 methods from the Sessions class have been deprecated:
- `Sessions.preset_parameters`
- `Sessions.run_conversation`
For each of these methods, you can use `Sessions.detect_intent` instead, which is fully backwards compatible.
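As a minimal migration sketch (the agent ID and utterance are placeholders, and the `(agent_id, session_id, text)` calling pattern is assumed from the library's session helpers), a turn that previously went through `Sessions.run_conversation` can be sent like this:

```python
from dfcx_scrapi.core.sessions import Sessions

# Placeholder agent resource name for illustration only.
agent_id = "projects/your-project/locations/us-central1/agents/11111-2222-33333-44444"

sessions = Sessions()

# Build a session ID for the target agent, then send a single user utterance.
session_id = sessions.build_session_id(agent_id)
response = sessions.detect_intent(agent_id, session_id, "I'd like to check on my order status")
print(response)
```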
New Features
Agent Tasks Generator
This is a specialized tool that allows a user to evaluate any arbitrary agent to determine what the Agent is capable of accomplishing from a task perspective.
This is a pre-release feature that will be accompanied by more automated testing features in the future.
For now, you can use this as a way to analyze arbitrary Agents to see if they are set up to perform the tasks you believe you configured them to do.
```python
from dfcx_scrapi.tools.agent_task_generator import AgentTaskGenerator

# Analyze the agent and return a summary of the tasks it is configured to perform.
atg = AgentTaskGenerator(agent_id=agent_id)
atg.get_agent_tasks()
```
Output
```
{'tasks': [{'name': 'Greeting and Intent Understanding',
   'description': 'The agent greets the user and attempts to understand their intent. It can provide basic information like translations or virtual money, and direct the user to appropriate tools or flows based on their request.'},
  {'name': 'Product and Company Information Retrieval',
   'description': 'The agent can access a data store containing information from the YETI website to answer user queries about YETI products and the company.'},
  {'name': 'Trip Planning Assistance',
   'description': 'The agent can collect basic information from the user to assist with trip planning, including destination, travel dates, and preferences. It can then pass this information to a separate flow for further processing.'}]}
```
Evaluation Dataset from Conversation History
You can now quickly create a dataset in the Evaluations format using pre-selected conversations from the Conversation History in your Agent.
Simply select the list of `conversation_ids` that you want and pass it to the `Evals.create_dataset_from_conv_ids` method, which returns a Pandas DataFrame.
This can be saved as a CSV, Google Sheet, or used locally to run Evals on your Agent.
```python
from dfcx_scrapi.core.conversation_history import ConversationHistory
from dfcx_scrapi.tools.evaluations import Evaluations

ch = ConversationHistory()
evals = Evaluations(agent_id=agent_id)

# Gather recent conversations and build an eval dataset from the first 5.
all_convos = ch.list_conversations(agent_id)
convo_ids = [convo.name for convo in all_convos[:5]]
evals.create_dataset_from_conv_ids(convo_ids)
```
Output
| | eval_id | action_id | action_type | action_input | action_input_parameters | tool_action | notes |
|---|---|---|---|---|---|---|---|
| 0 | 001 | 1 | User Utterance | what items do you have for dogs? | | | |
| 1 | 001 | 2 | Tool Invocation | yeti-website | {'requestBody': {'query': 'what items do you h... | yeti-website | |
| 2 | 001 | 3 | Agent Response | YETI offers dog bowls and dog beds. The Boomer... | | | |
| 3 | 002 | 1 | User Utterance | who is the ceo? | | | |
| 4 | 002 | 2 | Tool Invocation | yeti-website | {'requestBody': {'query': 'who is the ceo?'}} | yeti-website | |
| 5 | 002 | 3 | Agent Response | The CEO of YETI is Matt Reintjes. | | | |
| 6 | 003 | 1 | User Utterance | I want to speak to an operator | | | |
| 7 | 003 | 2 | Agent Response | Just a moment while I connect you... | | | |
| 8 | 004 | 1 | User Utterance | where is yeti hq at? | | | |
| 9 | 004 | 2 | Tool Invocation | yeti-website | {'requestBody': {'query': 'where is yeti hq at... | yeti-website | |
| 10 | 004 | 3 | Agent Response | YETI's headquarters is located in Austin, Texa... | | | |
| 11 | 005 | 1 | User Utterance | what is the smallest cup I can buy? | | | |
| 12 | 005 | 2 | Tool Invocation | yeti-website | {'requestBody': {'query': 'what is the smalles... | yeti-website | |
| 13 | 005 | 3 | Agent Response | The smallest cup you can buy is the 4oz cup. I... | | | |
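As a quick follow-on sketch (assuming the standard pandas `to_csv` API and the `run_query_and_eval` method shown in the v1.12.0 notes below; the filename is a placeholder), the returned DataFrame can be persisted or evaluated directly:

```python
# Build the dataset from the selected conversation IDs.
dataset_df = evals.create_dataset_from_conv_ids(convo_ids)

# Save it locally as a CSV for later use or review in Sheets.
dataset_df.to_csv("eval_dataset.csv", index=False)

# Or run an evaluation on it right away (you may need to load it through
# DataLoader first, as in the v1.12.0 example further down).
eval_results = evals.run_query_and_eval(dataset_df)
```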
CICD Workflow Example
We've added an example CICD workflow for anyone who is curious to see how this kind of pipeline could be set up using SCRAPI.
Fair warning, it's very involved! 😄
However, it provides some good pointers on how you can set up these types of complex pipelines using this library.
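As one illustration of the kind of step such a pipeline might include (a hedged sketch, not the workflow from the example itself), SCRAPI's `Agents` class can export an agent to GCS so a later stage can restore it into a higher environment; the project, bucket, and agent IDs below are placeholders:

```python
from dfcx_scrapi.core.agents import Agents

# Placeholder resource names for illustration only.
DEV_AGENT_ID = "projects/dev-project/locations/us-central1/agents/1111-2222"
GCS_BUCKET_URI = "gs://my-agent-artifacts/dev_agent_export"

agents = Agents()

# Export the dev agent to GCS; a later pipeline stage could restore this
# artifact into a UAT or prod project before running automated tests.
lro = agents.export_agent(agent_id=DEV_AGENT_ID, gcs_bucket_uri=GCS_BUCKET_URI)
print(lro)
```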
Enhancements
- Added support for `environment_id` when calling `Sessions.build_session_id`, which allows you to use `Sessions.detect_intent` with a `session_id` that includes an Environment (see the sketch after this list)
- Added `language_code` support throughout the Evaluations class
- Added support for setting BigQuery logging and interaction settings
- Added support for new lint rules in `ruff`
- Updated some out-of-date example notebooks and fixed broken links
- Added support for Session Parameters in Evaluations
Bug Fix
- Fixed several issues in Evaluations where dataframe prep and parsing were failing
What's Changed
- Feat/agent tasks generator by @kmaphoenix in #259
- feat: Implement Python requests-based get/set for BigQuery Interaction Logging Settings by @justin-oos in #258
- Feat/bq update sdk by @kmaphoenix in #263
- Fill Golden Template from Conv_Ids by @gmchueh in #261
- Fix/update lint rules by @kmaphoenix in #264
- Clean example notebooks (bot builder / vertex ai / google sheets) by @ethanknights in #256
- Adding session parameters to evaluations function by @AAMEHROTRA1230 in #262
- Feature/dfcxcicd by @sridharvikram in #239
- fix: add lang_code support for DataLoader by @kmaphoenix in #265
- fix/evaluations multiple tool pairing and empty utterance pairing by @gmchueh in #267
- Update google doc reference in nlu_evaluation_testing.ipynb by @ethanknights in #266
- Fix/session id with env by @kmaphoenix in #271
New Contributors
- @justin-oos made their first contribution in #258
- @gmchueh made their first contribution in #261
- @ethanknights made their first contribution in #256
- @AAMEHROTRA1230 made their first contribution in #262
- @sridharvikram made their first contribution in #239
Full Changelog: 1.12.5...1.13.0
v1.12.5
v1.12.4
v1.12.3
What's Changed
- Feature/conversation rebase by @kmaphoenix in #240
- Fix/default creds inheritance by @kmaphoenix in #246
- Prevent IndexError in collect_playbook_responses when not in playbook by @SeanScripts in #244
- feat: add support for flow invoke; clean up creds passing in evals by @kmaphoenix in #248
- Fix/optional tool call metrics evals by @kmaphoenix in #250
- Fix/support lang code conversation by @kmaphoenix in #251
Full Changelog: 1.12.2...1.12.3
v1.12.2
Enhancements
- Added support for `language_code` on all applicable methods in the Flows class (see the sketch after this list)
- Added support for `parameters` when using the Datastore Evaluations class and notebook
- Added support for Playbook Versions
- New notebook to check the status of datastores and search for Datastore IDs, Doc IDs, and URLs
- Added helper methods for Search to make listing URLs / doc IDs / documents much easier for users
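A rough sketch of the `language_code` support on the Flows class (which methods accept it and the exact keyword placement are assumptions based on the note above; the agent ID and locale are placeholders):

```python
from dfcx_scrapi.core.flows import Flows

flows = Flows()

# Placeholder agent resource name.
agent_id = "projects/your-project/locations/us-central1/agents/1111-2222"

# List flows with language-specific content, e.g. French fulfillment messages.
all_flows = flows.list_flows(agent_id, language_code="fr")
for flow in all_flows:
    print(flow.display_name)
```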
Bug Fix
- Fixed a bug in the CopyUtil class that was causing the `create_entity_type` method to fail
- Fixed a bug in Dataframe Functions which was causing scopes to not be inherited properly
- Fixed the new Vertex Agents Evals notebook links for GitHub and GCP Workbench launching to point to the correct location
What's Changed
- fix: add support for language_code on applicable methods by @kmaphoenix in #222
- fix: update copy_util to resolve bug issue 192 by @my3sons in #205
- Feat/parameter support datastore evals by @kmaphoenix in #225
- feat: add support for playbook versions by @kmaphoenix in #226
- Fix/scopes dataframe functions by @kmaphoenix in #228
- Update vertex_agents_evals.ipynb by @YuncongZhou in #231
- Feature/datastoreindexurls by @agutta in #235
- Feat/add vais search methods by @kmaphoenix in #237
- chore: update notebook to use latest scrapi code by @kmaphoenix in #238
New Contributors
- @my3sons made their first contribution in #205
- @YuncongZhou made their first contribution in #231
- @agutta made their first contribution in #235
Full Changelog: 1.12.1...1.12.2
v1.12.1
Bug
- Patch to require `google-cloud-aiplatform` as part of the setuptools
- The lack of `google-cloud-aiplatform` in setuptools was causing import errors in some classes that rely on `vertexai` as an import
Full Changelog: 1.12.0...1.12.1
v1.12.0
New Features
Evaluations are here! 🎉
What are Evaluations? 📐 📈
We know that building an Agent is only part of the journey.
Understanding how that Agent responds to real-world queries is a key indicator of how it will perform in Production.
Running evaluations, or "evals", allows Agent developers to quickly identify "losses", or areas of opportunity for improving Agent design.
Evals can provide answers to questions like:
- What is the current performance baseline for my Agent?
- How is my Agent performing after the most recent changes?
- If I switch to a new LLM, how does that change my Agent's performance?
Evaluation Toolsets in SCRAPI 🛠️🐍
For this latest release, we have included 2 specific Eval setups for developers to use with Agent Builder and Dialogflow CX Agents.
These are offered as two distinct evaluation toolsets for a few reasons:
- They support different build architectures in DFCX vs. Agent Builder
- They support different metrics based on the task you are trying to evaluate
- They support different tool calling setups: Native DataStores vs. arbitrary custom tools
Metrics by Toolset 📏
The following metrics are currently supported for each toolset.
Additional metrics will be added over time to support various other evaluation needs.
- DataStore Evaluations
  - Url Match
  - Context Recall
  - Faithfulness
  - Answer Correctness
  - RougeL
- Multi-Turn, Multi-Agent w/ Tool Calling Evaluations
  - Semantic Similarity
  - Exact Match Tool Quality
Getting Started with Evaluations 🏁
- Start by choosing your Eval toolset based on the Agent architecture you are evaluating
- Build an Evaluation Dataset. You can find detailed information about the dataset formats in each of the toolset instructions
- Run your evals!
Example Eval Setup for Multi-Turn, Multi-Agent w/ Tools
```python
import pandas as pd

from dfcx_scrapi.tools.evaluations import Evaluations
from dfcx_scrapi.tools.evaluations import DataLoader

data = DataLoader()

# Build a small eval dataset following the required input schema.
INPUT_SCHEMA_REQUIRED_COLUMNS = ['eval_id', 'action_id', 'action_type', 'action_input', 'action_input_parameters', 'tool_action', 'notes']
sample_df = pd.DataFrame(columns=INPUT_SCHEMA_REQUIRED_COLUMNS)

sample_df.loc[0] = ["travel-ai-001", 1, "User Utterance", "Paris", "", "", ""]
sample_df.loc[1] = ["travel-ai-001", 2, "Playbook Invocation", "Travel Inspiration", "", "", ""]
sample_df.loc[2] = ["travel-ai-001", 3, "Agent Response", "Paris is a beautiful city! Here are a few things you might enjoy doing there:\n\nVisit the Eiffel Tower\nTake a walk along the Champs-Élysées\nVisit the Louvre Museum\nSee the Arc de Triomphe\nTake a boat ride on the Seine River", "", "", ""]

# Load / validate the dataset, then run the queries and evaluate the results.
sample_df = data.from_dataframe(sample_df)

agent_id = "projects/your-project/locations/us-central1/agents/11111-2222-33333-44444"  # Example Agent

evals = Evaluations(agent_id, metrics=["response_similarity", "tool_call_quality"])
eval_results = evals.run_query_and_eval(sample_df.head(10))
```
What's Changed
- Feat/evaluations by @kmaphoenix in #217
- Feat/evals notebook by @kmaphoenix in #218
Full Changelog: 1.11.2...1.12.0
v1.11.2
What's Changed
- fix: improve markdown line handling by @kmaphoenix in #213
- Add build_search_engine_proto() to Engines by @rantman in #215
New Contributors
- @rantman made their first contribution in #215
Full Changelog: 1.11.1...1.11.2
v1.11.1
What's Changed
- FR - allow users to pass endUserMetadata as an optional in detect_intent and autoeval colab by @jkshj21 in #210
- FR-186 - Export results into multiple mode by @Naveenkm13 in #209
- Add creds to the constructors by @MRyderOC in #204
- Feat/playbook instructions parsing by @kmaphoenix in #212
New Contributors
- @Naveenkm13 made their first contribution in #209
Full Changelog: 1.11.0...1.11.1