
Feature method run multi prompts #33

Merged
8 commits merged from feature-method-run-multi-prompts into main on Feb 1, 2024

Conversation

@anujsinha3 (Collaborator) commented Jan 30, 2024

Change Description

Solution Description

Created a command-line typer command named eval_on_prompts_file() that:

  • takes a list of prompts (as a dictionary) as input
  • runs eval_prompt() on data_file for each prompt
  • logs the results and parameters for each prompt
  • logs the performance (evaluation metrics) for each run

Command-line syntax example:

autodoc eval-on-prompts-file data/autora/data.jsonl data/autora/prompts/all_prompt.json
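
A minimal sketch of how such a typer command might be structured, purely for illustration: eval_prompt here is a stub, the prompts-file format (a JSON list of dicts with a "prompt" key) is an assumption, and the merged implementation in src/autora/doc/pipelines/main.py may differ.

import json
from typing import Any, Dict, List

import typer

app = typer.Typer()


def eval_prompt(data_file: str, prompt: str) -> Dict[str, Any]:
    # Stand-in for the existing single-prompt evaluation; the real pipeline
    # would generate documentation and compute BLEU/METEOR here.
    return {"prompt": prompt, "bleu": 0.0, "meteor": 0.0}


@app.command(help="Evaluate the model for all prompts in the prompts_file")
def eval_prompts(data_file: str, prompts_file: str) -> List[Dict[str, Any]]:
    # The prompts file is assumed to hold a JSON list of dicts with a "prompt" key.
    with open(prompts_file, encoding="utf-8") as f:
        prompts: List[Dict[str, str]] = json.load(f)

    results: List[Dict[str, Any]] = []
    for item in prompts:
        result = eval_prompt(data_file, item["prompt"])
        typer.echo(f"prompt={item['prompt']!r} metrics={result}")  # log each run
        results.append(result)
    return results


if __name__ == "__main__":
    app()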

TODO: add tests

Code Quality

  • I have read the Contribution Guide
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

Project-Specific Pull Request Checklists

Bug Fix Checklist

  • My fix includes a new test that breaks as a result of the bug (if possible)
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

New Feature Checklist

  • I have added or updated the docstrings associated with my feature using the NumPy docstring format
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover my new feature
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

Documentation Change Checklist

Build/CI Change Checklist

  • If required or optional dependencies have changed (including version numbers), I have updated the README to reflect this
  • If this is a new CI setup, I have added the associated badge to the README

Other Change Checklist

  • Any new or updated docstrings use the NumPy docstring format.
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover any changes
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

@codecov-commenter commented Jan 30, 2024

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (3c7e0a0) 97.17% compared to head (667e77e) 97.32%.
Report is 1 commit behind head on main.

Files                              Patch %   Lines
src/autora/doc/pipelines/main.py   96.42%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #33      +/-   ##
==========================================
+ Coverage   97.17%   97.32%   +0.14%     
==========================================
  Files           3        5       +2     
  Lines         177      224      +47     
==========================================
+ Hits          172      218      +46     
- Misses          5        6       +1     


@anujsinha3 requested a review from carlosgjs January 30, 2024 11:29
@carlosgjs (Collaborator) left a comment
Looks great! A few minor changes. Also it looks like it's missing unit tests.

@@ -47,6 +48,44 @@ def evaluate_documentation(predictions: List[str], references: List[str]) -> Tup
return (bleu, meteor)


@app.command(help="Evaluate a model for code-to-documentation generation for all prompts in the prompts_file")
def eval_on_prompts_file(
Collaborator

The eval-on-prompts-file sounds a little verbose for the CLI. What do you think about something shorter, like eval_prompts?

@anujsinha3 (Collaborator, Author) replied Jan 31, 2024

Yes, that makes sense. I have implemented the change.
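
With the rename, the example invocation from the Solution Description would presumably become the following (typer maps the eval_prompts function name to a hyphenated subcommand):

autodoc eval-prompts data/autora/data.jsonl data/autora/prompts/all_prompt.json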

Additional review threads on src/autora/doc/pipelines/main.py were resolved (one marked outdated).
@carlosgjs (Collaborator) left a comment
Looking good, just one final adjustment. Thx!

@@ -49,7 +49,7 @@ def evaluate_documentation(predictions: List[str], references: List[str]) -> Tup


@app.command(help="Evaluate a model for code-to-documentation generation for all prompts in the prompts_file")
-def eval_on_prompts_file(
+def eval_prompts(
Collaborator

I was about to ask you to add a doc-comment for this function, in particular because it's hard to tell what the List[Dict[str, str]] will contain. But I think a better option is to create a type (a dataclass?) for the return type, e.g. an EvalResult class.

def get_eval_result_from_prediction(
    prediction: Tuple[List[str], float, float], prompt: str
) -> Dict[str, Any]:
    eval_result = {
Collaborator

See comment above, would be good to make this strongly typed
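
A minimal sketch of the strongly typed result being suggested, assuming the fields mirror the prediction tuple and prompt shown above; the field names and types are illustrative and may not match the class that was eventually merged.

from dataclasses import dataclass
from typing import List


@dataclass
class EvalResult:
    # Evaluation output for a single prompt.
    predictions: List[str]  # generated documentation strings
    prompt: str             # the prompt used to generate them
    bleu: float             # BLEU score for this run
    meteor: float           # METEOR score for this run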

@anujsinha3 marked this pull request as draft January 31, 2024 23:00
@anujsinha3 marked this pull request as ready for review February 1, 2024 00:55
@anujsinha3 merged commit e7c86f5 into main Feb 1, 2024
9 checks passed
@anujsinha3 deleted the feature-method-run-multi-prompts branch February 1, 2024 21:45