
Other types of scoring #1

Closed
fbcotter opened this issue Jul 17, 2018 · 6 comments

Comments

@fbcotter

I like what you're trying to do with this as a simple drop-in replacement for sklearn's GridSearchCV. Do you have the ability to use other scoring metrics besides accuracy?

@Arian96669

Thank you for the great work you've done on this!

I was using GridSearchCV and RandomizedSearchCV on a small dataset to optimise my parameters, and I found that after multiple hours of parameter search, the 'best' parameters identified by GridSearchCV and RandomizedSearchCV yielded poor results on both my validation and test sets. For instance, GridSearchCV reported a training MSE of 9k and a cross-validation MSE of 10k, but once I tried those parameters on my actual validation and test sets I got MSEs of over 100k! So clearly the 'best' params from GridSearchCV did not give me the best results.

So, while Googling for a way to specify a specific validation set for GridSearchCV, I came across hypopt.

Since the problem I'm working on is a regression problem, I thought it would be nice to use MSE as a scoring measure. So I agree with fbcotter: it would be a brilliant idea to add the flexibility to select the scoring method, perhaps as a parameter to GridSearch.fit.

What I'm currently experimenting with is using your _score to identify the top sets of parameters; I then iterate over them, train the model, and use sklearn.metrics.mean_squared_error to get the MSE on my train, validation, and test sets.
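The workaround described above can be sketched with plain scikit-learn. This is a minimal illustration, not hypopt's internals; the estimator, dataset, and parameter grid are all made up for the example:

```python
# Sketch of a manual grid search scored on a fixed validation set (not CV
# folds), then evaluated on a separate test set. Model and params are
# illustrative placeholders.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import ParameterGrid, train_test_split

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}
results = []
for params in ParameterGrid(param_grid):
    model = Ridge(**params).fit(X_train, y_train)
    results.append((mean_squared_error(y_val, model.predict(X_val)), params))

# Rank by validation MSE (lower is better) and refit the winner.
results.sort(key=lambda r: r[0])
best_val_mse, best_params = results[0]
best_model = Ridge(**best_params).fit(X_train, y_train)
print("best params:", best_params)
print("validation MSE:", best_val_mse)
print("test MSE:", mean_squared_error(y_test, best_model.predict(X_test)))
```

Because the validation set is fixed (rather than rotating CV folds), the score used to pick parameters matches how the model will actually be evaluated.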

@cgnorthcutt
Owner

cgnorthcutt commented Oct 25, 2018

@fbcotter - Hey, thanks for your request. Install the latest version (1.0.7 at the time of writing) via pip install hypopt. It supports any scoring metric (see the PyPI page or the GitHub README under the "Choosing the scoring metric to optimize" section).

@cgnorthcutt
Owner

cgnorthcutt commented Oct 25, 2018

@Arian96669 - This is fully supported now :) Please install version 1.0.7. In your case, all you'd have to do is gs.fit(X_train, y_train, params, X_val, y_val, scoring='neg_mean_squared_error'). You can read more on the PyPI page or the GitHub README under the "Choosing the scoring metric to optimize" section. Note that it's negative MSE because scoring functions require larger values to mean better, so errors must be negated, while accuracy can be used as-is. Enjoy, and glad this package is helpful!
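The sign convention mentioned above can be demonstrated with plain scikit-learn (the gs.fit call from this thread is shown only in a comment; the toy data below is illustrative):

```python
# sklearn-style scorers follow a "greater is better" convention, so error
# metrics like MSE are negated. The hypopt call from this thread would be:
#   gs.fit(X_train, y_train, params, X_val, y_val,
#          scoring='neg_mean_squared_error')
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, mean_squared_error

X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + np.random.RandomState(0).normal(scale=0.5, size=10)

model = LinearRegression().fit(X, y)
mse = mean_squared_error(y, model.predict(X))          # positive error
neg_mse = get_scorer("neg_mean_squared_error")(model, X, y)  # same value, negated

print("MSE:", mse)
print("scorer value:", neg_mse)
```

Maximizing the negated error is equivalent to minimizing the error, which is why accuracy-like metrics need no sign flip but error metrics do.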

@Arian96669

Excellent! Awesome! :-) Thank you for updating it, Curtis!

@cgnorthcutt
Owner

cgnorthcutt commented Oct 25, 2018

@Arian96669 @fbcotter Update: you'll actually want version 1.0.7. A final double-check today with some added testing found a missing negative sign in the scoring in 1.0.6, so please install the latest version for working MSE scoring. It went out on PyPI for pip install a few minutes ago. Cheers!

@cgnorthcutt
Owner

Marking this resolved, but feel free to comment again if any issues come up.
