
I'm currently working on a parallel and distributed computing project where I'm comparing the performance of XGBoost running on CPU vs GPU. The goal is to demonstrate how GPU acceleration can improve training time, especially when using appropriate parameters.

I've been trying different parameter combinations, but I haven't been able to get the GPU version to significantly outperform the CPU version. In most cases the CPU version performs as well as, or even faster than, the GPU version.
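For reference, the `@timer_decorator` used below is not reproduced here; a minimal sketch of such a decorator (an assumption, not necessarily the exact one in my project) could be:

```python
import time
from functools import wraps

def timer_decorator(func):
    """Print how long the wrapped function takes, then return its result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.2f} s")
        return result
    return wrapper
```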

    @timer_decorator
    def train_xgboost_cpu(self, X_train, y_train):
        """
        Train XGBoost Classifier on CPU with parameters that perform less efficiently
        """
        print("Training XGBoost Classifier on CPU...")
        xgb_clf = xgb.XGBClassifier(
            n_estimators=1500,        
            max_depth=15,              
            learning_rate=0.01,      
            subsample=0.9,             
            colsample_bytree=0.9,     
            objective='binary:logistic',
            tree_method='hist',          
            n_jobs=self.n_jobs,                 
            random_state=42,
            max_bin=256,               
            grow_policy='depthwise',
            verbosity=1,
            use_label_encoder=False
        )
        
        print(f"Training XGBoost CPU on data shape: {X_train.shape}")
        xgb_clf.fit(X_train, y_train)
        
        return xgb_clf

    @timer_decorator
    def train_xgboost_gpu(self, X_train, y_train):
        """
        Train XGBoost Classifier with GPU acceleration optimized for performance
        """
        if not XGB_GPU_AVAILABLE:
            print("XGBoost GPU support not available, falling back to CPU")
            return self.train_xgboost_cpu(X_train, y_train)
            
        # Initialize and train the model with GPU-optimized parameters
        print("Training XGBoost Classifier on GPU...")
        try:
            xgb_clf = xgb.XGBClassifier(
                n_estimators=1500,         
                max_depth=15,                
                learning_rate=0.01,          
                subsample=0.9,               
                colsample_bytree=0.9,        
                objective='binary:logistic',
                tree_method='gpu_hist',      
                predictor='gpu_predictor',   
                grow_policy='depthwise',     
                gpu_id=0,
                random_state=42,
                max_bin=256,                
                verbosity=1,
                use_label_encoder=False
            )
            xgb_clf.fit(X_train, y_train)
            return xgb_clf
        except Exception as e:
            print(f"XGBoost GPU training failed: {e}")
            print("Falling back to CPU training")
            return self.train_xgboost_cpu(X_train, y_train)

Key details:
Dataset size: ~41,000 rows (small/medium).
Goal: compare CPU vs GPU training performance.
Issue: despite trying many parameter combinations, the GPU version shows no significant speedup over the CPU version.
Observation: I suspect the dataset size might be too small to fully utilize the GPU, but I have to work with this dataset regardless.

Comments:
  • Maybe you could ask on similar portals (Data Science, Cross Validated, Artificial Intelligence) or the Kaggle forum; they may have more experience with ML, NN and AI. Commented May 2 at 21:22
  • @halfer Now I was thinking whether I should use "Maybe you could move the question to ..." or add "And remember to delete the question on this portal ...". OK, thanks, I will think about it. Commented May 2 at 22:20

