Other types of scoring #1
Thank you for the great work you've done on this! I was using GridSearchCV and RandomizedSearchCV on a small dataset to optimise my parameters, and I found that after spending multiple hours of parameter search, the 'best' parameters identified by GridSearchCV and RandomizedSearchCV yielded poor results on both my validation and test sets. For instance, GridSearchCV reported a training MSE of 9k and a cross-validation MSE of 10k, but once I tried those parameters on my actual validation and test sets I got MSEs of over 100k! So clearly the 'best' params from GridSearchCV did not give me the best results.

While Googling for a way to specify a specific validation set for GridSearchCV, I came across hypopt. Since the problem I'm working on is a regression problem, it would be nice to use MSE as a scoring measure. So I agree with fbcotter: it would be a brilliant idea to add the flexibility to select the scoring method, perhaps as a parameter to GridSearch.fit. What I'm currently experimenting with is using your _score to identify the top sets of parameters, then iterating over them, training the model, and using sklearn.metrics.mean_squared_error to get the MSE on my train, validation, and test sets.
@fbcotter - Hey, thanks for your request. Install the latest version (1.0.7 at the time of writing) via pip.
@Arian96669 - This is fully supported now :) Please install version 1.0.7. In your case, all you'd have to do is gs.fit(X_train, y_train, params, X_val, y_val, scoring='neg_mean_squared_error'). You can read more about this on the PyPI page or the GitHub README under the "Choosing the scoring metric to optimize" section. Note that it's *negative* MSE because scoring functions require 'good' values to be bigger, so error metrics have to be negated, while accuracy doesn't. Enjoy, and glad this package is helpful!
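The sign convention mentioned above can be illustrated with no dependencies at all. A scorer must return values where larger means better, so error metrics like MSE get negated; this is a minimal sketch (the helper names here are made up for illustration, not part of hypopt or sklearn):

```python
def mse(y_true, y_pred):
    # Plain mean squared error: lower is better.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def neg_mse_scorer(y_true, y_pred):
    # Negate so a lower-error model gets a *higher* score, matching
    # the spirit of sklearn's 'neg_mean_squared_error' string.
    return -mse(y_true, y_pred)

y_true = [3.0, 5.0, 7.0]
good_preds = [3.1, 4.9, 7.2]   # close to the targets
bad_preds = [6.0, 1.0, 10.0]   # far from the targets

# Under the negated convention, the better model scores higher.
assert neg_mse_scorer(y_true, good_preds) > neg_mse_scorer(y_true, bad_preds)
```

This is why accuracy needs no sign flip: it is already "bigger is better".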
Excellent! Awesome! :-) Thank you for updating it, Curtis!
@Arian96669 @fbcotter Update: you'll actually want version 1.0.7. A final double check today with some added testing turned up a missing negative sign in the scoring in 1.0.6, so please install the latest version for working MSE scoring. I pushed it to PyPI for pip install a few minutes ago. Cheers!
Marking this resolved, but feel free to comment again if any issues come up.
I like what you're trying to do with this as a simple drop-in replacement for sklearn's GridSearchCV. Would it be easy to add support for other scoring metrics besides accuracy?