Questions tagged [hyperparameter-tuning]
Hyperparameter tuning (also called hyperparameter optimization) refers to the process of finding the optimal set of hyperparameters for a given machine learning algorithm.
277 questions
0
votes
0
answers
16
views
How to handle unstable best_iteration in LightGBM when using Optuna for hyperparameter optimization?
I'm using Optuna to optimize LightGBM hyperparameters, and I'm running into an issue with the variability of best_iteration across different random seeds.
Current ...
3
votes
1
answer
101
views
K Fold Cross Validation - Manual Exploration or Use the result?
I have a dataset which I split into training, testing, and out-of-time sets. Then I feed my training set into K Fold CV.
I understand that K Fold Cross Validation is used as a method to select the ...
0
votes
1
answer
21
views
Should Hyperparameter Optimization Be Equalized by Trials or Compute Time?
Say I have two different models with different hyperparameters and I want to compare the performance of both models on some dataset.
One model is much simpler than the other and, therefore, if I were ...
3
votes
1
answer
41
views
How Do You Balance Feature Search Strategy and HP Optimization Cost?
What I’m trying to figure out
I'm working on a machine learning project and would love to hear your thoughts on two things:
A. How to prioritize feature exploration
B. Whether to fix hyperparameters (...
1
vote
0
answers
11
views
How to interpret an unstable learning curve on a model tuned with Hyperband Tuning?
I have used Hyperband automatic tuning for an ANN model to predict price. After running the model with the automatic tuning, I am obtaining an R2 score of 1.00, which suggests overfitting; however, I am ...
4
votes
0
answers
55
views
How can I improve mAP when optimizing YOLOv8 hyperparameters with metaheuristic algorithms (e.g., GWO)?
I am working on hyperparameter optimization for YOLOv8 using a metaheuristic algorithm. Currently, I am testing the ...
4
votes
3
answers
107
views
Can cross validation for tuning and LOO for evaluation on the exact same dataset cause bias?
I read two articles by the same guy where he uses the whole dataset for hyperparameter optimisation with CV and then evaluates the model with the best hyperparameters using leave one out on the ...
2
votes
0
answers
94
views
Why can monotonic feature transformation influence the performance of hyperparameter-tuned tree-based models (e.g., random forest)?
I recently observed something unexpected: Although monotonic feature transformation does not affect the performance of decision tree-based models with default hyperparameters, it actually does affect ...
0
votes
0
answers
44
views
Which hyperparameters for a standard LLM provide the most benefit vs performance cost?
GPT3 has several hyper-parameters that define the network architecture. My question is: which of these hyper-parameters, when increased, provide the most performance benefit vs computational cost? ...
1
vote
0
answers
52
views
Choosing the number of features via cross-validation
I have an algorithm that trains a binary predictive model for a specified number of features from the dataset (features are all of the same type, but not all important.) Thus, the number of features ...
1
vote
1
answer
138
views
Should I Use the Same Hyperparameters for Different Datasets in ML Models?
I am a student and am looking for your help.
I have two datasets, including pre-treatment CT scan and post-treatment CT scan. I want to compare these datasets to determine which yields the best ...
1
vote
1
answer
105
views
RFECV and grid search - what sets to use for hyperparameter tuning?
I am running machine learning models (all with scikit-learn estimators, no neural networks) using a custom dataset with a number of features and binomial output. I first split the dataset into 0.6 (...
1
vote
0
answers
33
views
Error in plotting Gaussian Process for 3 models that use Bayesian Optimization
I'm writing a Python script for Orange Data Mining to plot the Gaussian processes in order to find the best hyperparameters for the 5-fold cross-validation accuracy metric. The three models are SVC, ...
0
votes
1
answer
168
views
If min_samples_leaf is greater than min_samples_split in a decision tree, will it be a problem?
I am tuning the hyperparameters of the decision tree for a data set of 550 samples. As I am comparatively new to hyperparameter tuning (I am learning and implementing), I am confused about what values ...
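As a rough illustration of how the two settings in this question interact, here is a minimal sketch. It assumes scikit-learn's documented meaning of `min_samples_split` (minimum node size to attempt a split) and `min_samples_leaf` (minimum size of each child), both given as integer counts. The helpers `can_split` and `smallest_splittable_node` are hypothetical names and model only the size checks, not impurity.

```python
def can_split(n_samples, n_left, min_samples_split=2, min_samples_leaf=1):
    """Return True if a node with n_samples may be split so that the left
    child gets n_left samples and the right child gets the rest."""
    n_right = n_samples - n_left
    # The node itself must be large enough to be considered for splitting.
    if n_samples < min_samples_split:
        return False
    # Each child must keep at least min_samples_leaf samples.
    return n_left >= min_samples_leaf and n_right >= min_samples_leaf


def smallest_splittable_node(min_samples_split, min_samples_leaf):
    """Smallest node size for which any split can satisfy both constraints."""
    # Two children of min_samples_leaf each need 2 * min_samples_leaf samples.
    return max(min_samples_split, 2 * min_samples_leaf)
```

Under these assumptions a larger `min_samples_leaf` is not an error; it simply becomes the binding constraint, since a node needs at least `2 * min_samples_leaf` samples before any split is possible.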
1
vote
0
answers
55
views
What is the standard ML pipeline for training and testing? [closed]
I have a dataframe containing 1324 rows and 28 columns and I'm kinda lost on which approach to go for when training regression models. Currently I perform a data split and run GridSearchCV to pick the ...
1
vote
0
answers
109
views
Confused about use of random states for training models in scikit
I am new to ML and currently working on improving the accuracy of an MLPClassifier in scikit. My code looks like so
...
1
vote
0
answers
56
views
Hyper parameter tuning LSTM network on time series data
I am trying to train an LSTM model (containing four LSTM layers (500 units each), three dropouts, and a fully connected output layer to do regression) on time series data. To start with, I tried to ...
1
vote
0
answers
28
views
Why do I need a tiny learning rate to overfit the model?
I am trying to train an LSTM model on time series data with 1.6 million records. I have taken a window size of 200.
Initially I tried to overfit the model (train data = test data) on a tiny dataset (few ...
1
vote
1
answer
55
views
XGB: find hyperparameters and then cross-validation
I want to train an XGBoost model, and here's how I believe the process should go:
Step 1: Find the optimal hyperparameters using GridSearchCV.
Step 2: Evaluate the selected parameters.
My question is: ...
0
votes
1
answer
29
views
Tuning non-hyperparameters in scikit-learn
In scikit-learn RandomSearch or GridSearch, how can non-hyperparameters be included in the tuning process? Non-hyperparameters are parameters not related to the machine learning algorithms. For example ...
2
votes
1
answer
189
views
The best algorithm(s) for finding the best hyperparameters (special case)
I would like to ask for help with the following.
Given the following dataset, which I have split into train and test sets:
...
1
vote
1
answer
73
views
How do I automate testing and comparison of the performance of models with different layer depths, layer types, and unit counts?
I am testing the effects of different layer counts/depths, unit counts, and layer types for natural language processing. I made a Kaggle notebook where I manually create different layers and then ...
2
votes
1
answer
1k
views
How to select the optimal beam size for beam search?
Most Text Generation Models use beam search to select the optimal output candidate. How does one choose the optimal beam size? It would probably vary from task to task, dataset to dataset, and model ...
1
vote
2
answers
641
views
Feature selection or hyperparameter tuning first for 30 feature data
I have about 30 variables and am trying to create a Random Forest model. All the variables are expected to be predictors of outcome. I want to find the best model based on a C-stat score with any number ...
1
vote
1
answer
208
views
Why is it so common to focus only on validation performance during hyper-parameter optimization?
Assuming a standard train/validation/test split, the common practice is (a) to train multiple models with different hyper-parameter configurations on the training set, (b) to evaluate these models ...
1
vote
1
answer
47
views
Hyperparameter tuning
Jane trains three different classifiers: Logistic Regression, Decision Tree, and Support Vector Machines on the training set. Each classifier has one hyper-parameter (regularisation parameter, depth-...
0
votes
1
answer
79
views
Add tuning stage to DVC pipeline
I have an ML pipeline built with DVC that I use for experiment tracking. This allows running and tracking several experiments. Also, using hydra integration I can grid search hyper parameters. However,...
1
vote
1
answer
745
views
Ordering of Train/Val/Test set use in hyperparameter tuning
The way I read a lot of ML advice on these datasets, it sounds like "You train a model with randomly chosen hyperparameters first on the training set, then you ignore this bit of the work, ...
-1
votes
1
answer
76
views
For some reason I'm getting an odd error when hypertuning my model
I was hoping to hypertune my decision tree model; however, I keep running into this error:
TypeError: DecisionTreeClassifier() got an unexpected keyword argument 'criterion'
Here is what I tried:
...
1
vote
1
answer
99
views
How are the successive sets of training samples that are allocated for each iteration of HalvingGridSearchCV determined?
The scikit-learn classes HalvingGridSearchCV and HalvingRandomSearchCV implement a hyperparameter tuning method known as successive halving. It is an iterative selection process in which all the ...
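To make the successive-halving idea in this excerpt concrete, here is a minimal sketch of a generic schedule. It is not scikit-learn's actual implementation, which additionally handles options such as `min_resources='exhaust'` and `aggressive_elimination`. The function name `halving_schedule` is hypothetical. The sketch assumes that each iteration keeps roughly `1 / factor` of the candidates and multiplies the resources by `factor`. It shows only how many samples each iteration uses, not which samples are drawn.

```python
import math


def halving_schedule(n_candidates, min_resources, max_resources, factor=3):
    """Return a list of (n_candidates, n_resources) pairs, one per iteration."""
    schedule = []
    n_resources = min_resources
    while n_resources <= max_resources:
        schedule.append((n_candidates, n_resources))
        if n_candidates == 1:
            break  # a single survivor remains, nothing left to eliminate
        # Keep the best 1/factor of the candidates (rounded up) ...
        n_candidates = math.ceil(n_candidates / factor)
        # ... and give the survivors factor times more samples.
        n_resources *= factor
    return schedule


print(halving_schedule(n_candidates=27, min_resources=10, max_resources=1000))
# prints [(27, 10), (9, 30), (3, 90), (1, 270)]
```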
0
votes
1
answer
67
views
Cross validation and train_test_split
I am building a class that follows the workflow:
Model Selection and Fitting
The class accepts a list of models and their respective hyperparameter grids. It then performs a standard fitting process ...
2
votes
1
answer
109
views
Changing model architecture doesn't impact results
I am currently learning binary classification.
The problem is classifying positive and negative movie reviews.
The dataset is 25,000 reviews with each review represented by 10,000 of the most used ...
1
vote
2
answers
114
views
Which hyperparameters are returned as best in cross validation?
The description of RandomizedSearchCV says this about the best hyperparameters:
"Estimator that was chosen by the search, i.e. estimator which gave highest ...
0
votes
1
answer
817
views
Is hyperparameter tuning done on training or validation data set?
Is hyperparameter tuning done on training or validation data set? The post here gives mixed opinions as to whether the training set should be used for hyperparameter tuning. And I would like to know ...
0
votes
1
answer
46
views
Flow of machine learning model including code
I'm towards the completion of my first data science project that will go into my GitHub portfolio.
I'll be happy for some clarification regarding the machine learning models section:
I got a little ...
0
votes
1
answer
819
views
Is it mandatory to set a random_state when using RandomizedSearchCV?
When I use RandomizedSearchCV, if I set the random state I always obtain the same results with the same hyperparams. So, is it mandatory to use? Because in my opinion it is better to always ...
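The reproducibility effect described in this question can be sketched with the standard library alone. This is a simplified stand-in for a randomized search, not scikit-learn's code. The helper `sample_candidates` and the search space `space` are hypothetical, and the sketch assumes each hyperparameter is drawn uniformly from a finite list.

```python
import random


def sample_candidates(param_space, n_iter, seed=None):
    """Draw n_iter hyperparameter combinations from lists of allowed values."""
    rng = random.Random(seed)  # seed=None gives a different stream on each run
    return [
        {name: rng.choice(values) for name, values in param_space.items()}
        for _ in range(n_iter)
    ]


# Hypothetical search space, for illustration only.
space = {"max_depth": [2, 4, 6, 8], "learning_rate": [0.01, 0.05, 0.1]}

run_a = sample_candidates(space, n_iter=5, seed=0)
run_b = sample_candidates(space, n_iter=5, seed=0)
print(run_a == run_b)  # prints True: a fixed seed reproduces the same candidates
```

Fixing the seed is optional. It trades exploration across reruns for the ability to reproduce and compare a particular search.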
0
votes
0
answers
131
views
Is it possible to perform probability calibration with a model with the best hyperparams?
If I use RandomizedSearchCV to find the optimal hyperparams of a model, can I create another model, with those parameters, to calibrate probabilities using CalibratedClassifierCV?
The new model is not ...
2
votes
1
answer
889
views
Optuna pruning during cross-validation, does it make sense?
I'm currently trying to build a model using CatBoost. For my parameter tuning, I'm using optuna and cross-validation and pruning the trial checking on the intermediate cross-validation scores. Here ...
1
vote
1
answer
306
views
Can I fit a model with the parameters found with RandomizedSearchCV?
I want to ask you a question.
Suppose I use the following RandomizedSearchCV to find the model's best hyperparams:
...
0
votes
1
answer
96
views
Question about grid search and KFold
I am trying an example in which I train on a huge dataset (5M rows, only 4 features) with cuDF and cuML, and I am using SGD logistic regression because I must predict whether the patient is sick or not ...
1
vote
1
answer
145
views
Which of 2 options is better practice for model optimization: 1) Nested CV wrongly averaging inner CV scores. 2) Two successive CVs on X_all. Alternatives?
Goal: Compare preprocessing methods, models, and hyperparameters without leaking into the final generalization estimate, applying cross-validation (cv), i.e. NOT applying any fixed train/test splits.
...
0
votes
1
answer
698
views
Tuned model has higher CV accuracy, but a lower test accuracy. Should I use the tuned or untuned model?
I am working on a classification problem using scikit-learn and am confused about how to properly tune hyperparameters to get the "best" model.
Before any tuning, my logistic regression ...
0
votes
1
answer
60
views
Train/val/test approach for hyperparameter tuning
When looking to train a model, does it make sense to have a 60-20-20 train val test split, first hyper parameter tuning over the training dataset, using the validation set, picking the best model. ...
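The 60-20-20 split mentioned in this question could look like the following minimal sketch, which works on sample indices only. The function name `train_val_test_split` and its parameters are hypothetical, and it assumes a simple shuffled split with no stratification or time ordering.

```python
import random


def train_val_test_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the sample indices and cut them into train/val/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]  # everything left over, about 60%
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = train_val_test_split(100)
print(len(train_idx), len(val_idx), len(test_idx))  # prints 60 20 20
```

In the workflow the question describes, candidate configurations would be fitted on `train_idx`, compared on `val_idx`, and `test_idx` would be touched only once, for the final estimate.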
1
vote
1
answer
886
views
How to determine which combinations of parameters to include in GridSearchCV
I am using MLPClassifier from sklearn and I would like to tune it with GridSearchCV. But I don't know which set of values to include for hidden_layer_sizes, max_iter, activation, solver, etc. How can ...
0
votes
0
answers
2k
views
Got this error from Keras Tuner: Number of consecutive failures excceeded the limit of 3
I'm getting this error when I try to use Keras Tuner with my model:
Number of consecutive failures excceeded the limit of 3. .... KeyError: 'mean_squared_error'
Here's my code:
...
1
vote
1
answer
29
views
Underfitting and performance metrics in unsupervised methods
My question is simple and yet quite hard to find an answer to. In an unsupervised method, for example, when you have to reconstruct an input, how can you tell if your loss is good enough? Generally, ...
0
votes
2
answers
1k
views
Is there any benefit to using cross validation from the XGBoost library over sklearn when tuning hyperparameters?
The XGBoost library has its own implementation of cross validation through xgboost.cv(). It looks like it requires data be stored as a DMatrix.
Instead of using ...
0
votes
2
answers
72
views
Optimization of the entire model development process
I want to perform a global optimization of the entire model development pipeline. I have several stages of development, each of which can be performed automatically: preprocessing, removal of outliers/...
2
votes
2
answers
3k
views
Grid_search (RandomizedSearchCV) extremely slow with SVM (SVC)
I'm testing hyperparameters for an SVM, however, when I resort to Gridsearch or RandomizedSearchCV, I haven't been able to get a resolution, because the processing time is exceeding hours.
My dataset ...