
# (Project Name) Evaluation Report

(Delete this section when using as a template.)

Standardizations:

- Use "preds" for machine predictions and "golds" for ground truth data; do not use other terms.
- Use GitHub commit permalinks for each of the links below so that they track back to the exact App, Golds, Preds, and evaluate.py.

File Naming Convention:
TODO

Results File Naming: results.txt should be renamed so it is clear which report it belongs to.

## Evaluation Instance Information

- Date/Time in ISO format
- [App/Model] (github commit link), version info.
- [Ground Truth/Golds Dataset] (github commit link).
- [Prediction Dataset] (github commit link).
- [Evaluation Script] (github commit link).
- Run command or instructions that generated the report.

## Metrics

Definition of metrics: state which metrics are used, what they mean, how they are constructed, which values are acceptable, and whether higher or lower is better.
[metric implementation module] (link).
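
As an illustration of the kind of definition this section should contain, here is a minimal sketch that computes precision, recall, and F1 over paired golds and preds (all three range from 0 to 1; higher is better). The exact-match criterion and the use of "O" as the negative label are assumptions made for the example, not this project's actual metric implementation.

```python
# Minimal sketch of precision/recall/F1 over paired gold/pred labels.
# Assumptions (not this project's actual metrics): labels are compared
# by exact match, and "O" marks the negative/background class.

def precision_recall_f1(golds: list, preds: list) -> dict:
    """Return micro-averaged precision, recall, and F1 (0-1, higher is better)."""
    if len(golds) != len(preds):
        raise ValueError("golds and preds must be the same length")
    true_pos = sum(1 for g, p in zip(golds, preds) if g == p and p != "O")
    pred_pos = sum(1 for p in preds if p != "O")
    gold_pos = sum(1 for g in golds if g != "O")
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```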

## Results

### Side-by-Side Views

The side-by-side comparison of gold and pred annotations is visible within the [results output].
Each file contains the annotations for one video document.
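
The exact annotation format is project-specific; the following is only a hedged sketch of how such a side-by-side view could be assembled, assuming annotations are plain-text labels keyed by timestamp (the keys and labels below are made-up placeholders).

```python
# Hypothetical sketch of a side-by-side gold/pred view.
# Assumption: annotations are plain-text labels keyed by timestamp;
# the real results output may use a different structure.

def side_by_side(golds: dict, preds: dict) -> str:
    """Render gold and pred annotations in aligned columns, one row per key."""
    rows = ["timestamp\tgold\tpred"]
    for key in sorted(set(golds) | set(preds)):
        rows.append(f"{key}\t{golds.get(key, '-')}\t{preds.get(key, '-')}")
    return "\n".join(rows)


# Made-up placeholder values, purely illustrative:
print(side_by_side(
    {"00:00:05": "label-A", "00:00:12": "label-B"},
    {"00:00:05": "label-A", "00:00:12": "label-C"},
))
```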

## Limitations/Issues

- Issue Name -