
Trajectories that raise an error are ignored #13

Open
aleSuglia opened this issue Feb 11, 2022 · 1 comment

Comments


aleSuglia commented Feb 11, 2022

Hi @aishwaryap,

I was reading about the recent changes to the code reported in #10, and unfortunately we get results that differ substantially from yours. I started dissecting the code to understand the reason for these discrepancies. From my understanding of the inference_runner.py script, you spawn several processes, each handling a given portion of the tasks. However, I can see that the exception handling logic simply ignores any instance that raises an error: https://github.com/emma-simbot/teach/blob/speaker_tokens/src/teach/inference/inference_runner.py#L130

This is detrimental: if a dataset instance errors for whatever reason, its contribution to the overall metrics is silently dropped. Instead, the proper way to handle this would be to skip that trajectory but still record it in the metrics as a failure. Ideally, such faulty trajectories would also be reported in the metrics file for future debugging.
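
For concreteness, here is a minimal sketch of what I have in mind (the names `run_edh_instance`, `metrics`, and `failed_instances` are placeholders for illustration, not the actual code in this repo):

```python
def process_instances(instances):
    # Hypothetical sketch, not the repo's actual implementation.
    metrics = {}
    failed_instances = []
    for instance in instances:
        try:
            metrics[instance.id] = run_edh_instance(instance)
        except Exception as e:
            # Instead of silently skipping the instance, record it as a
            # failure so it still counts against the aggregate metrics.
            metrics[instance.id] = {"success": 0, "error": str(e)}
            failed_instances.append(instance.id)
    return metrics, failed_instances
```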

Am I missing something?

@aishwaryap
Contributor

Hi @aleSuglia,

There is a deliberate separation here between possible errors from the model and errors from the rest of the inference code. Most of the error handling is inside InferenceRunner._run_edh_instance (here), where exceptions from invocations of the model should get logged in the metrics. Are you seeing specific errors elsewhere that are not getting caught?

That said, we probably should be returning the list of failed files in InferenceRunner._run() for debugging. I will work on this on Monday.
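
Roughly along these lines (a sketch with placeholder names, not the exact code in the repo):

```python
def run_all(edh_instance_files, run_edh_instance):
    # Hypothetical sketch of collecting failed files from the run loop.
    failed_files = []
    for instance_file in edh_instance_files:
        try:
            run_edh_instance(instance_file)
        except Exception:
            # Keep going, but remember which file failed.
            failed_files.append(instance_file)
    # The caller can then dump failed_files alongside the metrics file
    # for later debugging.
    return failed_files
```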

For leaderboard evaluations, we are separately confirming that all inference files did get saved.
