feat(traces): evaluation annotations on traces for associating spans with eval metrics #1693
Merged
Conversation
Check out this pull request on ReviewNB: see visual diffs & provide feedback on Jupyter Notebooks.
mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from eda9807 to ea0b642 on November 2, 2023 05:03
mikeldking changed the title from "feat(traces): evaluation annotations on traces for associating spans with evaluations" to "feat(traces): evaluation annotations on traces for associating spans with eval metrics" on Nov 2, 2023
mikeldking commented on Nov 2, 2023
axiomofjoy reviewed on Nov 10, 2023: Looking good so far.
axiomofjoy reviewed on Nov 10, 2023: I like the name "run" for the output of an evaluation rather than "dataset". It makes it clearer that there can be multiple.
RogerHYang reviewed on Nov 30, 2023
RogerHYang approved these changes on Dec 1, 2023
mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from c6449a4 to 0bd2737 on December 1, 2023 20:31
RogerHYang reviewed on Dec 1, 2023
RogerHYang reviewed on Dec 1, 2023
mikeldking force-pushed the 1691-eval-annotations-in-dataset branch from 1a84535 to 7ea193b on December 1, 2023 20:58
mikeldking added a commit that referenced this pull request on Dec 1, 2023:

feat(traces): evaluation annotations on traces for associating spans with eval metrics (#1693)

* feat: initial associations of evaluations to traces
* add some documentation
* wip: add dataframe utils
* Switch to a single evaluation per dataframe
* make copy the default
* fix doc string
* fix name
* fix notebook
* Add immutability
* remove value from being required
* fix tutorials formatting
* make type a string to see if it fixes tests
* fix test to handle un-parsable
* Update src/phoenix/trace/trace_eval_dataset.py (Co-authored-by: Xander Song <[email protected]>)
* Update src/phoenix/trace/trace_eval_dataset.py (Co-authored-by: Xander Song <[email protected]>)
* change to trace_evaluations
* cleanup
* Fix formatting
* pr comments
* cleanup notebook
* make sure columns are dropped
* remove unused test

Co-authored-by: Xander Song <[email protected]>
mikeldking added a commit that referenced this pull request on Dec 1, 2023 (same message as above)
mikeldking added a commit that referenced this pull request on Dec 4, 2023:

* fix: trace dataset to disk
* feat(traces): evaluation annotations on traces for associating spans with eval metrics (#1693) (squash of the commits listed above)
* delete the metadata
* optimize removal of metadata
* shallow copy of dataframe

Co-authored-by: Xander Song <[email protected]>
jlopatec pushed a commit to jlopatec/phoenix that referenced this pull request on Dec 4, 2023 (same message as the Dec 4 commit above, referencing Arize-ai#1693)
resolves #1691

This adds the ability to associate eval results with trace datasets. Notably, it keeps track of a single eval run in a new TraceEvaluations object, which contains the eval results as well as the name of the eval.

Considerations:

- SpanEvaluations contain information about one eval. This leaves room for metadata to be associated with the eval, such as the model used.
- evaluations on the TraceDataset is a list. This makes it easy to append to and allows multiple evaluations to be run (including duplicates). While this could cause "squashing" of evals, it doesn't cause data loss in any true sense, so it is a bit more future-proof than, say, a dict. A minimal sketch of this data model is shown below.
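To make the shape of this data model concrete, here is a minimal sketch in Python. The SpanEvaluations constructor signature, the context.span_id index name, and the result columns (score, label, explanation) are assumptions for illustration drawn from the description above, not the exact Phoenix API.

```python
import pandas as pd

# One eval run: results keyed by span id, plus the eval's name.
# Column names here are illustrative, not prescribed.
eval_df = pd.DataFrame(
    {
        "score": [1.0, 0.0],
        "label": ["relevant", "irrelevant"],
        "explanation": ["matches the query", "off-topic"],
    },
    index=pd.Index(["span-id-1", "span-id-2"], name="context.span_id"),
)


class SpanEvaluations:
    """Results of a single eval run (hypothetical shape)."""

    def __init__(self, eval_name: str, dataframe: pd.DataFrame) -> None:
        self.eval_name = eval_name
        # Copy by default so callers can't mutate the stored results,
        # in the spirit of the "make copy the default" and
        # "Add immutability" commits in this PR.
        self.dataframe = dataframe.copy()


# `evaluations` on the dataset is a plain list, so multiple runs,
# including duplicate runs of the same eval, can simply be appended.
evaluations: list[SpanEvaluations] = []
evaluations.append(SpanEvaluations("relevance", eval_df))
evaluations.append(SpanEvaluations("relevance", eval_df))  # duplicates allowed
```

Because the list preserves every run, a downstream consumer can decide how to resolve duplicates (for example, keeping the latest run per eval name) without the dataset itself discarding any data.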