
feat(traces): evaluation annotations on traces for associating spans with eval metrics #1693

Merged: 22 commits into main on Dec 1, 2023

Conversation

mikeldking
Contributor

@mikeldking mikeldking commented Nov 1, 2023

resolves #1691

This adds the ability to associate eval results with trace datasets. Notably, it tracks a single eval run in a new TraceEvaluations object, which contains the eval results as well as the name of the eval.

Considerations

  • Made a SpanEvaluations contain information about a single eval - this leaves room for metadata to be associated with the eval, such as the model used.
  • Made the evaluations on the TraceDataset a list - this makes appending easy and allows multiple evaluations to be run (including duplicates). While duplicates could later be "squashed", a list doesn't cause data loss in any true sense, so it is a bit more future-proof than, say, a dict.
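The considerations above can be sketched as a minimal data model. This is an illustrative shape only, not Phoenix's actual API: the names and fields are assumptions, and plain dicts stand in for the pandas DataFrames the PR actually uses.

```python
# Hypothetical sketch of the shape described in this PR. SpanEvaluations
# bundles one eval's name with its per-span results, leaving room for
# metadata such as the model used; the trace dataset keeps a *list* of
# evaluations so repeated runs append rather than overwrite each other.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SpanEvaluations:
    eval_name: str                      # e.g. "hallucination"
    results: Dict[str, dict]            # span_id -> {"label": ..., "score": ...}
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. {"model": "gpt-4"}


@dataclass
class TraceDataset:
    spans: Dict[str, dict]              # span_id -> span attributes
    evaluations: List[SpanEvaluations] = field(default_factory=list)

    def append_evaluations(self, evals: SpanEvaluations) -> None:
        # A list (rather than a dict keyed by eval_name) keeps duplicate
        # runs side by side instead of silently squashing earlier results.
        self.evaluations.append(evals)


ds = TraceDataset(spans={"span-1": {"name": "llm"}})
ds.append_evaluations(
    SpanEvaluations("hallucination", {"span-1": {"label": "factual", "score": 1.0}})
)
ds.append_evaluations(
    SpanEvaluations("hallucination", {"span-1": {"label": "factual", "score": 1.0}})
)
print(len(ds.evaluations))  # 2: both runs retained
```

The list-over-dict choice is what makes the duplicate-run case lossless: a dict keyed by eval name would overwrite the first run when the second is appended.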


@mikeldking mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from eda9807 to ea0b642 Compare November 2, 2023 05:03
@mikeldking mikeldking changed the title feat(traces): evaluation annotations on traces for associating spans with evaluations feat(traces): evaluation annotations on traces for associating spans with eval metrics Nov 2, 2023
@mikeldking mikeldking marked this pull request as ready for review November 2, 2023 14:45
@mikeldking mikeldking requested a review from RogerHYang November 3, 2023 01:58
src/phoenix/trace/spans_dataframe_utils.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_eval_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_eval_dataset.py Outdated Show resolved Hide resolved
@axiomofjoy
Contributor

Looking good so far.

@axiomofjoy
Contributor

I like the name "run" for the output from an evaluation rather than "dataset". It makes it more clear that there can be multiple.

@mikeldking
Contributor Author

> I like the name "run" for the output from an evaluation rather than "dataset". It makes it more clear that there can be multiple.

Naming is hard. On one hand I do like consistency of language, but I hear you: it's a set of EvaluationResults.

@mikeldking mikeldking marked this pull request as draft November 30, 2023 22:58
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
@mikeldking mikeldking marked this pull request as ready for review December 1, 2023 03:56
src/phoenix/trace/trace_evaluations.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_evaluations.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_evaluations.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_evaluations.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Outdated Show resolved Hide resolved
src/phoenix/trace/trace_dataset.py Show resolved Hide resolved
src/phoenix/trace/trace_evaluations.py Outdated Show resolved Hide resolved
@mikeldking mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from c6449a4 to 0bd2737 Compare December 1, 2023 20:31
@mikeldking mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from 1a84535 to 7ea193b Compare December 1, 2023 20:58
@mikeldking mikeldking merged commit a218a65 into main Dec 1, 2023
10 checks passed
@mikeldking mikeldking deleted the 1691-eval-annotations-in-dataset branch December 1, 2023 21:21
mikeldking added a commit that referenced this pull request Dec 1, 2023
…with eval metrics (#1693)

* feat: initial associations of evaluations to traces

* add some documentation

* wip: add dataframe utils

* Switch to a single evaluation per dataframe

* make copy the default

* fix doc string

* fix name

* fix notebook

* Add immutability

* remove value from being required

* fix tutorials formatting

* make type a string to see if it fixes tests

* fix test to handle un-parsable

* Update src/phoenix/trace/trace_eval_dataset.py

Co-authored-by: Xander Song <[email protected]>

* Update src/phoenix/trace/trace_eval_dataset.py

Co-authored-by: Xander Song <[email protected]>

* change to trace_evaluations

* cleanup

* Fix formatting

* pr comments

* cleanup notebook

* make sure columns are dropped

* remove unused test

---------

Co-authored-by: Xander Song <[email protected]>
mikeldking added a commit that referenced this pull request Dec 1, 2023
…with eval metrics (#1693)
mikeldking added a commit that referenced this pull request Dec 4, 2023
* fix: trace dataset to disc

* feat(traces): evaluation annotations on traces for associating spans with eval metrics (#1693)

* delete the metadata

* optimize removal of metadata

* shallow copy of dataframe

---------

Co-authored-by: Xander Song <[email protected]>
jlopatec pushed a commit to jlopatec/phoenix that referenced this pull request Dec 4, 2023
Successfully merging this pull request may close these issues.

[traces][rag] Add the ability to store evaluations alongside a trace dataset
3 participants