I am training a transformer-based neural network, and the validation loss is not decreasing even though the training loss does decrease. I am wondering whether it's possible to debug or change the architecture so that the validation loss also decreases, or whether this necessarily means I need to debug my dataset.

  • You mean overfitting? — Commented Apr 28, 2024 at 11:06
  • @Alberto Not overfitting. The validation loss never went down at all in the first place. — Commented Apr 28, 2024 at 15:11
  • Overfitting doesn't require the validation loss to decrease initially. — Commented Apr 28, 2024 at 18:23
  • @Alberto Then how do you know whether something is overfitting versus the validation dataset distribution being totally different from training? — Commented Apr 28, 2024 at 18:33
  • You can't, in general: overfitting means you are not generalizing to the validation set, and by definition, if the two distributions are not the same, you are not generalizing. — Commented Apr 28, 2024 at 18:49
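One practical way to probe the question raised in the comments (overfitting versus a mismatched validation distribution) is to hold out a slice of the *training* data as a second, in-distribution validation set. If the loss is low on that held-out slice but stays high on the real validation set, the validation data likely comes from a different distribution. The sketch below illustrates the idea with a toy least-squares model standing in for the transformer; all names and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training distribution: y = 2x + noise, with x in [0, 1]
X_train = rng.uniform(0, 1, size=(200, 1))
y_train = 2 * X_train[:, 0] + rng.normal(0, 0.1, size=200)

# Hold out 20% of the training data as an in-distribution check set
split = 160
X_fit, y_fit = X_train[:split], y_train[:split]
X_held, y_held = X_train[split:], y_train[split:]

# "Real" validation set drawn from a shifted distribution:
# x in [5, 6], and the target relationship is offset as well
X_val = rng.uniform(5, 6, size=(50, 1))
y_val = 2 * X_val[:, 0] + 3 + rng.normal(0, 0.1, size=50)

# Fit a simple least-squares model (a stand-in for the real network)
w, *_ = np.linalg.lstsq(np.c_[X_fit, np.ones(split)], y_fit, rcond=None)

def mse(X, y):
    pred = np.c_[X, np.ones(len(X))] @ w
    return float(np.mean((pred - y) ** 2))

held_loss = mse(X_held, y_held)  # low: same distribution as training
val_loss = mse(X_val, y_val)     # high: validation distribution is shifted

print(f"held-out-train loss: {held_loss:.3f}")
print(f"validation loss:     {val_loss:.3f}")
```

If both losses stay high, the model is failing to fit even in-distribution data, which points back at the architecture or training setup rather than the dataset.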
