
It seems that GridSearchCV of scikit-learn collects the scores of its (inner) cross-validation folds and then averages across the scores of all folds. I was wondering about the rationale behind this. At first glance, it would seem more flexible to instead collect the predictions of its cross-validation folds and then apply the chosen scoring metric to the predictions of all folds.

The reason I stumbled upon this is that I use GridSearchCV on an imbalanced data set with cv=LeaveOneOut() and scoring='balanced_accuracy' (scikit-learn v0.20.dev0). It doesn't make sense to apply a scoring metric such as balanced accuracy (or recall) to each left-out sample. Rather, I would want to collect all predictions first and then apply my scoring metric once to all predictions. Or does this involve an error in reasoning?

Update: I solved it by creating a custom grid search class based on GridSearchCV with the difference that predictions are first collected from all inner folds and the scoring metric is applied once.
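For illustration only (this is not the custom class mentioned above, and the dataset and estimator are invented for the example), the core idea can be sketched with `cross_val_predict`, which pools the out-of-fold predictions so the metric is applied once to all of them:

```python
# Sketch: gather the left-out predictions from every LeaveOneOut split first,
# then compute balanced accuracy once over the pooled predictions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import balanced_accuracy_score

# Toy imbalanced dataset and estimator, purely for demonstration.
X, y = make_classification(n_samples=60, weights=[0.8, 0.2], random_state=0)
clf = LogisticRegression(solver="liblinear")

# Pooled predictions: each sample is predicted exactly once while left out.
pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
print(balanced_accuracy_score(y, pred))

# Per-fold scoring (what GridSearchCV averages) is degenerate here: each
# LeaveOneOut fold contains a single sample, so a per-fold "balanced accuracy"
# is not meaningful.
```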

1 Answer


GridSearchCV uses the scoring to decide which hyperparameter values to select for the model.

If you want to estimate the performance of the "optimal" hyperparameters, you need an additional, outer round of cross-validation.

See http://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html

EDIT, to get closer to answering the actual question: it seems reasonable to me to collect the predictions from each fold and then score them all at once, if you want to use LeaveOneOut and balanced_accuracy. I think you need to write your own grid searcher to do that; model_selection.ParameterGrid and model_selection.KFold (or LeaveOneOut) would be the building blocks.
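A rough sketch of that suggestion, assuming a toy dataset and SVC as the estimator (both made up for the example, not part of the original question):

```python
# Sketch of a minimal "grid search" that pools predictions across folds:
# for each candidate from ParameterGrid, collect the out-of-fold predictions
# over all LeaveOneOut splits, then apply the metric once per candidate.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import LeaveOneOut, ParameterGrid
from sklearn.svm import SVC

X, y = make_classification(n_samples=50, weights=[0.8, 0.2], random_state=0)
param_grid = ParameterGrid({"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]})
cv = LeaveOneOut()

best_score, best_params = -np.inf, None
for params in param_grid:
    preds = np.empty_like(y)
    for train_idx, test_idx in cv.split(X, y):
        model = clone(SVC()).set_params(**params)
        model.fit(X[train_idx], y[train_idx])
        preds[test_idx] = model.predict(X[test_idx])
    # The scoring metric is applied once, to the pooled predictions.
    score = balanced_accuracy_score(y, preds)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```

The same loop structure works with any splitter and any metric that needs to see all predictions at once (recall, F1, etc.); only the inner `cv` and the scoring call change.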


6 Comments

Yes, but my question is about the way GridSearchCV uses scoring. It applies the scoring separately to each inner cross-validation fold and then averages the resulting scores, instead of first collecting the predictions from all inner folds and applying the scoring metric once. In my case the latter seems more appropriate.
OK, I misunderstood the question then. For me it seems reasonable to collect predictions for each fold and then score them all, if you want to use LeaveOneOut and balanced_accuracy. I guess you need to make your own grid searcher to do that. You could use model_selection.ParameterGrid and model_selection.KFold for that.
Thanks @KPLauritzen, I'll do that!
Great. Please update, so we know if it worked for you :)
@monade hi sorry for bringing up old thread! I would like to implement the same thing, may I ask which function did you override within the GridSearchCV to achieve this?
