I am training a transformer-based neural network and the validation loss is not decreasing, although the training loss does. I am wondering whether it is possible to debug or change the architecture so that the validation loss starts decreasing, or whether I definitely need to debug my dataset.
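One sanity check I've considered (a sketch, not a definitive diagnosis): re-split my own training data into a fresh train/validation pair, so the new validation slice is guaranteed to share the training distribution. If loss on this re-split validation set does decrease, the problem is more likely a distribution mismatch with my original validation set than a model/architecture issue. Here `examples` is a placeholder for my dataset:

```python
import random

def split_train_val(examples, val_frac=0.1, seed=0):
    """Randomly re-split the data so the new validation slice
    is drawn from the same distribution as the training slice."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_frac))
    # first n_val shuffled examples become the new validation set
    return shuffled[n_val:], shuffled[:n_val]

# Train on `train`, track loss on `val`. If this validation loss
# now decreases while the original one never did, the original
# validation set likely differs in distribution from the training set.
train, val = split_train_val(list(range(100)))
```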
- you mean overfitting? – Alberto, Apr 28, 2024 at 11:06
- @Alberto Not overfitting. The validation loss never went down in the first place. – JobHunter69, Apr 28, 2024 at 15:11
- overfitting never requires that the validation loss decrease initially – Alberto, Apr 28, 2024 at 18:23
- @Alberto Then how do you tell overfitting apart from a validation set whose distribution is totally different from the training set's? – JobHunter69, Apr 28, 2024 at 18:33
- you can't: overfitting means you are not generalizing to the validation set, and by definition, if the two distributions are not the same, you are not generalizing – Alberto, Apr 28, 2024 at 18:49