Set default value of avoid_duplicate_runs to false for run_model_on_task #1145

chadmarchand

Reference Issue

Closes #1143

What does this PR implement/fix? Explain your changes.

Sets the default value of avoid_duplicate_runs in the run_model_on_task function to False. When True, this option avoids running an experiment that already exists on OpenML, but the check requires an API key. With this change, an API key is only required when avoid_duplicate_runs is explicitly set to True.

How should this PR be tested?

Verified that, without an API key, the following code block no longer returns a 401 error after this change.

from sklearn import ensemble
from openml import tasks, runs

clf = ensemble.RandomForestClassifier()
task = tasks.get_task(3954)
run = runs.run_model_on_task(clf, task)

I don't believe an automated test is required, since we are only changing a default value and not any implementation, but please let me know if you think otherwise and I will add one.

@PGijsbers

Thanks for putting in the effort. After some more thought, we have decided that we instead simply want to issue a warning if the user has avoid_duplicate_runs set to True when no API key is set. In the future we expect flow/exists to be callable without authentication, and then we can remove the warning.
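For illustration only, the warning approach could look roughly like the sketch below. The helper name `warn_if_duplicate_check_unauthenticated`, its parameters, the message text, and the choice of `RuntimeWarning` are all assumptions made for this example; none of them is taken from the openml-python code base.

```python
import warnings
from typing import Optional


def warn_if_duplicate_check_unauthenticated(
    avoid_duplicate_runs: bool, apikey: Optional[str]
) -> bool:
    """Hypothetical helper (not part of the openml API).

    Warns when a duplicate-run check is requested but no API key is
    configured. Returns True if a warning was issued, False otherwise.
    """
    # An empty string and None are both treated as "no API key set".
    if avoid_duplicate_runs and not apikey:
        warnings.warn(
            "avoid_duplicate_runs is set to True, but no API key is set. "
            "Checking for existing runs on the server requires authentication.",
            RuntimeWarning,
        )
        return True
    return False
```

A caller such as run_model_on_task could invoke a check like this before contacting the server, so that users without an API key see a clear message instead of a bare 401 error.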


Successfully merging this pull request may close these issues.

run_model_on_task: make avoid_duplicate_runs=False the default