run_model_on_task: make avoid_duplicate_runs=False the default #1143
Comments
I had a closer look, and the problem is actually at openml-python/openml/flows/functions.py, line 256 in 99a62f6.
With that fixed, you can run the above code without an API key configured while still having the [...] So now the question becomes: should we still prefer to have it turned off by default regardless?
Also @mfeurer
That's an interesting question. Maybe we can move this flag to the upload/publish function instead? It would serve the same purpose but slightly improve the user experience, as users could still run things without having to worry about duplicate runs.
The idea of having it here is that the user may avoid unnecessary computation, since they can identify that there's a duplicate before running the experiment and download the results instead (not so much avoiding duplicates on the server, or so I thought).
Interesting, I thought it was to avoid duplicate stuff on the server. @joaquinvanschoren would you still like to remove that flag now that @PGijsbers has found a workaround?
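To make the trade-off in this thread concrete, here is a minimal sketch of the control flow being discussed. It is not the openml-python implementation: `run_experiment`, `lookup_existing`, `compute_locally`, and `lookup_without_key` are hypothetical names, and the lookup callable merely stands in for a server query that needs an API key.

```python
from typing import Callable, Optional


def run_experiment(
    compute: Callable[[], str],
    lookup_existing: Callable[[], Optional[str]],
    avoid_duplicate_runs: bool = False,
) -> str:
    """Return a result, consulting the server first only when asked to."""
    if avoid_duplicate_runs:
        # Hypothetical stand-in for the server query that requires an API key.
        existing = lookup_existing()
        if existing is not None:
            # A stored result exists, so the local computation is skipped.
            return existing
    return compute()


def lookup_without_key() -> Optional[str]:
    # Simulates what happens when no API key is configured.
    raise RuntimeError("no API key configured")


def compute_locally() -> str:
    return "computed locally"


# With the flag off (the proposed default), the lookup is never attempted.
result = run_experiment(compute_locally, lookup_without_key)
```

With the flag off, the key-dependent lookup is never reached, so the run succeeds offline. With the flag on, the lookup happens before any computation, which is what lets a user skip work that already exists on the server.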
Fixes #1143. This change means that run results will by default not be fetched from the server, but computed locally. The benefit is that the operation no longer requires an API key or internet connection by default.
Description
run_model_on_task has an option to avoid running experiments that already exist on OpenML, called avoid_duplicate_runs. This, however, requires an API key. It is currently the default, meaning that people can't try out this function without setting their API key.
This creates an unnecessary obstacle, especially for beginners who don't know that the avoid_duplicate_runs option can be switched off.
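Until the default changes, the option can be switched off explicitly. The following is a rough sketch of what that might look like, assuming `openml` and `scikit-learn` are installed; the wrapper name `run_without_duplicate_check`, the choice of `DecisionTreeClassifier`, and the task ID in the usage comment are illustrative assumptions, not part of the original report.

```python
def run_without_duplicate_check(task_id: int):
    """Run a scikit-learn model on an OpenML task with the duplicate check disabled."""
    # Imports live inside the function so that defining it needs no extra packages.
    import openml
    from sklearn.tree import DecisionTreeClassifier

    # Fetches the task description from OpenML, or loads it from the local cache.
    task = openml.tasks.get_task(task_id)

    # Any scikit-learn estimator could be used here; this one is an arbitrary choice.
    model = DecisionTreeClassifier()

    # Passing avoid_duplicate_runs=False skips the lookup for existing runs on the server.
    return openml.runs.run_model_on_task(model, task, avoid_duplicate_runs=False)


# Example call (the task ID is an arbitrary illustration):
# run = run_without_duplicate_check(31)
```

The returned run object stays local; nothing is uploaded unless the user later decides to publish it.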
Steps/Code to Reproduce
Expected Results
The model should just run. The user may have no intention to upload the run to OpenML later.
Actual Results
An API key error is thrown.
Versions
Linux-5.4.188+-x86_64-with-Ubuntu-18.04-bionic
Python 3.7.13 (default, Apr 24 2022, 01:04:09)
[GCC 7.5.0]
NumPy 1.21.6
SciPy 1.4.1
Scikit-Learn 1.0.2
OpenML 0.12.2