I'm currently working on a parallel and distributed computing project where I'm comparing the performance of XGBoost running on CPU vs GPU. The goal is to demonstrate how GPU acceleration can improve training time, especially when using appropriate parameters.
I've been trying different parameter combinations, but I haven't been able to get the GPU version to significantly outperform the CPU version. In fact, in most cases the CPU version matches or even beats the GPU version.
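For context, the `timer_decorator` used below isn't shown in the snippet; a minimal sketch of what it is assumed to do (print wall-clock time and pass the result through) would be:

```python
import time
from functools import wraps

def timer_decorator(func):
    """Print the wall-clock time a call takes, then return its result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.2f}s")
        return result
    return wrapper

@timer_decorator
def slow_add(a, b):
    time.sleep(0.1)
    return a + b
```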
@timer_decorator
def train_xgboost_cpu(self, X_train, y_train):
    """
    Train XGBoost Classifier on CPU (baseline with deliberately heavy parameters)
    """
    print("Training XGBoost Classifier on CPU...")
    xgb_clf = xgb.XGBClassifier(
        n_estimators=1500,
        max_depth=15,
        learning_rate=0.01,
        subsample=0.9,
        colsample_bytree=0.9,
        objective='binary:logistic',
        tree_method='hist',
        n_jobs=self.n_jobs,
        random_state=42,
        max_bin=256,
        grow_policy='depthwise',
        verbosity=1,
        use_label_encoder=False  # deprecated in recent XGBoost versions
    )
    print(f"Training XGBoost CPU on data shape: {X_train.shape}")
    xgb_clf.fit(X_train, y_train)
    return xgb_clf
@timer_decorator
def train_xgboost_gpu(self, X_train, y_train):
    """
    Train XGBoost Classifier with GPU acceleration optimized for performance
    """
    if not XGB_GPU_AVAILABLE:
        print("XGBoost GPU support not available, falling back to CPU")
        return self.train_xgboost_cpu(X_train, y_train)
    # Initialize and train the model with GPU-optimized parameters
    print("Training XGBoost Classifier on GPU...")
    try:
        xgb_clf = xgb.XGBClassifier(
            n_estimators=1500,
            max_depth=15,
            learning_rate=0.01,
            subsample=0.9,
            colsample_bytree=0.9,
            objective='binary:logistic',
            tree_method='gpu_hist',
            predictor='gpu_predictor',
            grow_policy='depthwise',
            gpu_id=0,
            random_state=42,
            max_bin=256,
            verbosity=1,
            use_label_encoder=False
        )
        xgb_clf.fit(X_train, y_train)
        return xgb_clf
    except Exception as e:
        print(f"XGBoost GPU training failed: {e}")
        print("Falling back to CPU training")
        return self.train_xgboost_cpu(X_train, y_train)
Key Details:
- Dataset size: ~41,000 rows (small/medium-sized).
- Goal: Compare CPU vs GPU training performance.
- Issue: Despite trying many parameter combinations, the GPU version shows no significant speedup over the CPU version.
- Observation: I suspect the dataset size might be too small to fully utilize the GPU, but I have to work with this dataset regardless.