
I'm developing a fraud detection model using XGBoost.

I cannot share the data (sorry).

The CPU-based model works well and identifies frauds as expected.

The GPU-based model, at the same confidence level, identifies far fewer frauds.

These are the parameters for the CPU model:

import multiprocessing as mp

params = {'objective': 'multi:softprob',
          'booster': 'dart',
          'max_depth': 5,
          'eta': 0.1,
          'subsample': 0.2,
          'nthread': mp.cpu_count() - 1,   # leave one core free
          'eval_metric': 'merror',
          'colsample_bytree': 0.2,
          'num_class': 2}

The parameters for the GPU model training are:

params = {'objective': 'multi:softprob',
          'subsample': 0.2,
          'gpu_id': 0,
          'num_class': 2,
          'tree_method': 'gpu_hist',
          'max_depth': 5,
          'eta': 0.1,
          'gamma': 1100,
          'eval_metric': 'mlogloss'}
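
Since the data can't be shared, here is a minimal sketch of how the two models could be compared at the same confidence level, with synthetic imbalanced data standing in for the real transactions. The synthetic data, the number of boosting rounds (200), and the 0.5 threshold are placeholder assumptions, and a CUDA-capable GPU is assumed for tree_method='gpu_hist':

import multiprocessing as mp
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic, heavily imbalanced stand-in for the private fraud data
X, y = make_classification(n_samples=50_000, n_features=30,
                           weights=[0.98, 0.02], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

cpu_params = {'objective': 'multi:softprob', 'booster': 'dart', 'max_depth': 5,
              'eta': 0.1, 'subsample': 0.2, 'nthread': mp.cpu_count() - 1,
              'eval_metric': 'merror', 'colsample_bytree': 0.2, 'num_class': 2}

gpu_params = {'objective': 'multi:softprob', 'subsample': 0.2, 'gpu_id': 0,
              'num_class': 2, 'tree_method': 'gpu_hist', 'max_depth': 5,
              'eta': 0.1, 'gamma': 1100, 'eval_metric': 'mlogloss'}

num_round = 200  # placeholder number of boosting rounds
bst_cpu = xgb.train(cpu_params, dtrain, num_boost_round=num_round)
bst_gpu = xgb.train(gpu_params, dtrain, num_boost_round=num_round)

# multi:softprob returns one probability column per class; column 1 is "fraud".
# iteration_range makes the dart booster evaluate all trained trees at prediction time.
threshold = 0.5
proba_cpu = bst_cpu.predict(dtest, iteration_range=(0, num_round))[:, 1]
proba_gpu = bst_gpu.predict(dtest, iteration_range=(0, num_round))[:, 1]
print("CPU frauds flagged:", int((proba_cpu >= threshold).sum()))
print("GPU frauds flagged:", int((proba_gpu >= threshold).sum()))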

1 Answer


It is due to the use of different tree_method parameters. Most probably the CPU run is using tree_method='exact', since you haven't set tree_method explicitly, whereas the GPU run uses tree_method='gpu_hist'. You can test this by adding tree_method='exact' to your CPU params and checking whether you still get the same accuracy as before. You can find more information on all the tree methods in the XGBoost documentation.
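
As a sketch of that test, reusing the dtrain, dtest, and num_round placeholders from a setup like the one shown in the question:

cpu_params_exact = dict(cpu_params, tree_method='exact')  # make the CPU tree method explicit
bst_exact = xgb.train(cpu_params_exact, dtrain, num_boost_round=num_round)
proba_exact = bst_exact.predict(dtest, iteration_range=(0, num_round))[:, 1]
# If these results match your original CPU run, the CPU was already using the
# 'exact' method by default, and the remaining gap versus the GPU model comes
# from switching to the histogram-based 'gpu_hist' method.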
