I am training a transformer-based neural network and the validation loss is not decreasing, although the training loss does. I am wondering whether it is possible to debug or change the architecture so that the validation loss starts decreasing, or whether I definitely need to debug my dataset.
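One sanity check I've considered (a sketch, not a definitive diagnosis): re-split my own training data into a fresh train/validation pair, so the new validation slice is guaranteed to share the training distribution. If loss on this re-split validation set does decrease, the problem is more likely a distribution mismatch with my original validation set than a model/architecture issue. Here `examples` is a placeholder for my dataset:

```python
import random

def split_train_val(examples, val_frac=0.1, seed=0):
    """Randomly re-split the data so the new validation slice
    is drawn from the same distribution as the training slice."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_frac))
    # first n_val shuffled examples become the new validation set
    return shuffled[n_val:], shuffled[:n_val]

# Train on `train`, track loss on `val`. If this validation loss
# now decreases while the original one never did, the original
# validation set likely differs in distribution from the training set.
train, val = split_train_val(list(range(100)))
```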
- you mean overfitting? – Alberto, Apr 28, 2024 at 11:06
- @Alberto Not overfitting. The validation loss never went down in the first place. – JobHunter69, Apr 28, 2024 at 15:11
- overfitting never requires that the validation loss decrease initially – Alberto, Apr 28, 2024 at 18:23
- @Alberto Then how do you tell overfitting apart from a validation set whose distribution is totally different from the training set's? – JobHunter69, Apr 28, 2024 at 18:33
- you can't: overfitting means you are not generalizing to the validation set, and by definition, if the two distributions are not the same, you are not generalizing – Alberto, Apr 28, 2024 at 18:49