This repository contains the evaluation codebase, results, and reports for the AAPB-CLAMS collaboration project. Evaluations are run on a single CLAMS App, or on a pipeline/group of CLAMS Apps, that produces an evaluable result for a given video-metadata-extraction task.
Each subdirectory of the repository is an evaluation task within the project. Each contains its own set of the following kinds of files:
- golds - The gold-standard, human-annotated files against which apps are evaluated for predictive ability.
  - synonymous with "ref", "reference", "groundtruth", or "goldstandard"
  - often `.tsv`, `.csv`, or `.txt` files.
- predictions - The app-predicted files containing the predicted annotations of the phenomena to be evaluated (e.g. time durations for slate detection).
  - synonymous with "pred", "test", "system", or "output"
  - each preds directory represents a batch, named `preds@<APP_NAME><APP_VER>@<BATCH_NAME>`
  - always `.mmif` files with app views (see the sketch after this list).
- results - The system output of a finished evaluation, i.e. the evaluation numbers.
  - often a `results.txt` file. This should be renamed according to the conventions currently listed here.
  - This term was previously used to describe machine-output prediction "results"; it no longer refers to that.
  - There might be results per GUID, or a summary.
- reports - Reports are more formal documents that describe the results, intended for business intelligence.
  - There are plans to automate some of the report generation from the results, which may require automation scripts; however, parts of the report must often be manually curated.
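
As a quick illustration of the predictions format, the sketch below loads one prediction `.mmif` file and lists the app views it contains. It assumes the `mmif-python` package is installed; the batch directory and file name are hypothetical examples of the naming convention above.

```python
from pathlib import Path

from mmif import Mmif

# Hypothetical batch directory following preds@<APP_NAME><APP_VER>@<BATCH_NAME>,
# with a hypothetical GUID-named MMIF file inside it.
pred_path = Path("preds@app-slatedetection1.0@example-batch") / "cpb-aacip-000000.mmif"

mmif_obj = Mmif(pred_path.read_text())

# Each view records which app produced it, so a batch can be sanity-checked
# against the app name and version in the directory name.
for view in mmif_obj.views:
    print(view.id, view.metadata.app)
```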
See Remaining Work for continued filename convention issues.
Important
In the future, all evaluations should be invoked in the same manner, likely through a Docker module or a uniform CLI command.
- cd into the appropriate task_eval directory
- example command: `python3 -m evaluate -g url/to/gold/web -p path/to/local/mmifpreds -r results_printout_filename.txt`
Many of the evaluations should also retrieve the golds automatically, using `from clams_utils.aapb import goldretriever` and `goldretriever.download_golds(<params>)`. Thus, it is usually not required to provide `-g`.
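
For example, here is a minimal sketch of retrieving golds in code; the gold URL below is hypothetical, and the exact parameters of `download_golds` should be checked against `clams_utils`.

```python
from clams_utils.aapb import goldretriever

# Hypothetical URL pointing at a golds directory in the aapb-annotations repo.
gold_url = "https://github.com/clamsproject/aapb-annotations/tree/main/example-task/golds"

# Assumed to download the gold files to a local directory and return its path,
# so -g does not need to be passed on the command line.
gold_dir = goldretriever.download_golds(gold_url)
print("golds downloaded to:", gold_dir)
```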
- Choose evaluation task, create batch with GUIDs.
- In AAPB Annotations, create raw annotations, then run `process.py` to turn them into golds. Upload those golds via a github commit. (Requires preprocessing and access to videos)
- Run the app/pipeline-of-apps to create output pred `.mmif` files locally on your machine. (Also requires access to videos)
- Run the evaluation code, inputting the url-to-golds-commit and the path-to-local-mmifs, to obtain result files (see the sketch after this list).
- Use python code, plus some hand-generation, to produce a `nameconvention-report.txt` summary of the results.
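
The sketch below illustrates the evaluation step of this workflow, invoking one task's evaluation module from Python rather than the shell. The task directory, gold commit URL, and batch name are placeholders, not real values from this repository.

```python
import subprocess

task_dir = "slate-detection_eval"  # hypothetical task directory
golds = "https://github.com/clamsproject/aapb-annotations/tree/<commit>/<task>/golds"
preds = "preds@<APP_NAME><APP_VER>@<BATCH_NAME>"  # local batch of pred .mmif files
results = "results_printout_filename.txt"

# Equivalent to cd-ing into the task directory and running the example command above.
subprocess.run(
    ["python3", "-m", "evaluate", "-g", golds, "-p", preds, "-r", results],
    cwd=task_dir,
    check=True,
)
```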
CLAMS Apps Manual.
TestDrive Instructions (Alternate).
The users and use cases of this evaluation workflow remain under discussion. For the moment, the expected work has been converted into issues.