I am making a simple PyTorch neural net to approximate the sine function on x in [0, 2π]. This is a simple architecture I use with different deep learning libraries to test whether I understand how to use them. The neural net, when untrained, always produces a straight horizontal line, and when trained, produces a straight line at y = 0. In general, it always produces a straight line at y = (the mean of the function). This leads me to believe something is wrong with the forward-propagation part, as the output should not be just a straight line when untrained. Here is the code for the net:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(1, 20),
            nn.Sigmoid(),
            nn.Linear(20, 50),
            nn.Sigmoid(),
            nn.Linear(50, 50),
            nn.Sigmoid(),
            nn.Linear(50, 1)
        )

    def forward(self, x):
        x = self.model(x)
        return x

Here is the training loop:

def train(net, trainloader, valloader, learningrate, n_epochs):
    net = net.train()
    loss = nn.MSELoss()
    optimizer = torch.optim.SGD(net.parameters(), lr = learningrate)

    for epoch in range(n_epochs):

        for X, y in trainloader:
            X = X.reshape(-1, 1)
            y = y.view(-1, 1)
            optimizer.zero_grad()

            outputs = net(X)

            error   = loss(outputs, y)
            error.backward()
            #net.parameters()  net.parameters() * learningrate
            optimizer.step()

        total_loss = 0
        for X, y in valloader:
            X = X.reshape(-1, 1).float()
            y = y.view(-1, 1)
            outputs = net(X)
            error   = loss(outputs, y)
            total_loss += error.data

        print('Val loss for epoch', epoch, 'is', total_loss / len(valloader) )

It is called as:

net = Net()
losslist = train(net, trainloader, valloader, .0001, n_epochs = 4)

Here trainloader and valloader are the training and validation loaders. Can anyone help me see what's wrong with this? I know it's not the learning rate, since it's the one I use in other frameworks, and I know it's not the fact that I'm using SGD or sigmoid activation functions, although I suspect the error is somewhere in the activation functions.

Does anyone know how to fix this? Thanks.

2 Answers

After playing with some hyperparameters, modifying the net, and changing the optimizer (following this excellent recipe), I ended up changing the line optimizer = torch.optim.SGD(net.parameters(), lr = learningrate) to optimizer = torch.optim.Adam(net.parameters()) (with the default optimizer parameters), and running for 100 epochs with a batch size of 1.

The following code was used (tested on CPU only):

import torch
import torch.nn as nn
from torch.utils import data
import numpy as np
import matplotlib.pyplot as plt

# for reproducibility
torch.manual_seed(0)
np.random.seed(0)

class Dataset(data.Dataset):

    def __init__(self, init, end, n):

        self.n = n
        self.x = np.random.rand(self.n, 1) * (end - init) + init
        self.y = np.sin(self.x)

    def __len__(self):

        return self.n

    def __getitem__(self, idx):

        x = self.x[idx, np.newaxis]
        y = self.y[idx, np.newaxis]

        return torch.Tensor(x), torch.Tensor(y)


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = nn.Sequential(
        nn.Linear(1, 20),
        nn.Sigmoid(),
        nn.Linear(20, 50),
        nn.Sigmoid(),
        nn.Linear(50, 50),
        nn.Sigmoid(),
        nn.Linear(50, 1)
        )

    def forward(self, x):
        x = self.model(x)
        return x

def train(net, trainloader, valloader, n_epochs):

    loss = nn.MSELoss()
    # Switch the two following lines and run the code
    # optimizer = torch.optim.SGD(net.parameters(), lr = 0.0001)
    optimizer = torch.optim.Adam(net.parameters())

    for epoch in range(n_epochs):

        net.train()
        for x, y in trainloader:
            optimizer.zero_grad()
            # Keep the output the same shape as y so MSELoss does not broadcast
            outputs = net(x)
            error   = loss(outputs, y)
            error.backward()
            optimizer.step()

        net.eval()
        total_loss = 0
        with torch.no_grad():  # no gradients are needed for validation
            for x, y in valloader:
                outputs = net(x)
                error   = loss(outputs, y)
                total_loss += error.item()

        print('Val loss for epoch', epoch, 'is', total_loss / len(valloader) )    

    net.eval()

    f, (ax1, ax2) = plt.subplots(1, 2, sharey=True)

    def plot_result(ax, dataloader):
        out, xx, yy = [], [], []
        for x, y in dataloader:
            out.append(net(x))
            xx.append(x)
            yy.append(y)
        out = torch.cat(out, dim=0).detach().numpy().reshape(-1)
        xx = torch.cat(xx, dim=0).numpy().reshape(-1)
        yy = torch.cat(yy, dim=0).numpy().reshape(-1)
        ax.scatter(xx, yy, facecolor='green')
        ax.scatter(xx, out, facecolor='red')
        xx = np.linspace(0.0, 3.14159*2, 1000)
        ax.plot(xx, np.sin(xx), color='green')

    plot_result(ax1, trainloader)
    plot_result(ax2, valloader)
    plt.show()


train_dataset = Dataset(0.0, 3.14159*2, 100)
val_dataset = Dataset(0.0, 3.14159*2, 30)

params = {'batch_size': 1,
          'shuffle': True,
          'num_workers': 4}

trainloader = data.DataLoader(train_dataset, **params)
valloader = data.DataLoader(val_dataset, **params)

net = Net()
train(net, trainloader, valloader, n_epochs = 100)  # train() returns nothing

Result with Adam optimizer: [image]

Result with SGD optimizer: [image]
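For a quicker side-by-side check without plots, here is a minimal sketch that trains the same architecture twice from identical initial weights and prints the final training MSE for each optimizer. It is not the exact setup above: it assumes full-batch training on 200 evenly spaced points and 1500 steps, and the helper names make_net and final_loss are made up for this illustration.

```python
import math
import torch
import torch.nn as nn

def make_net():
    # Same layer sizes as the Net class above.
    return nn.Sequential(
        nn.Linear(1, 20), nn.Sigmoid(),
        nn.Linear(20, 50), nn.Sigmoid(),
        nn.Linear(50, 50), nn.Sigmoid(),
        nn.Linear(50, 1),
    )

def final_loss(optimizer_name, n_steps=1500):
    """Train full-batch (an assumption of this sketch) and return the final MSE."""
    torch.manual_seed(0)  # identical initial weights for both runs
    net = make_net()
    x = torch.linspace(0.0, 2 * math.pi, 200).unsqueeze(1)  # shape (200, 1)
    y = torch.sin(x)
    if optimizer_name == "sgd":
        optimizer = torch.optim.SGD(net.parameters(), lr=0.0001)
    else:
        optimizer = torch.optim.Adam(net.parameters())  # default lr=0.001
    loss_fn = nn.MSELoss()
    for _ in range(n_steps):
        optimizer.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return loss_fn(net(x), y).item()

sgd_loss = final_loss("sgd")
adam_loss = final_loss("adam")
print(f"SGD  (lr=1e-4): {sgd_loss:.4f}")
print(f"Adam (default): {adam_loss:.4f}")
```

For reference, a model that always predicts 0 has an MSE of about 0.5 on this data, so a loss close to 0.5 means the net is still outputting a flat line.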

Quoting the question: "In general, it always produces a straight line at y = (The mean of the function)."

Usually, this means that the NN has only managed to train its final layer so far. You need to train it for longer or with a better optimizer, as ViniciusArruda shows in the other answer.
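One way to look into this yourself is to print the gradient norm of every parameter after a single backward pass and compare the last layer with the earlier ones. This is only a rough diagnostic sketch: the evenly spaced full-batch input and the grad_norms dictionary are assumptions made for the illustration, and the exact numbers depend on the random initialization.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer sizes as the net in the question.
net = nn.Sequential(
    nn.Linear(1, 20), nn.Sigmoid(),
    nn.Linear(20, 50), nn.Sigmoid(),
    nn.Linear(50, 50), nn.Sigmoid(),
    nn.Linear(50, 1),
)

x = torch.linspace(0.0, 2 * math.pi, 200).unsqueeze(1)  # shape (200, 1)
y = torch.sin(x)

# One forward and backward pass on the untrained net.
loss = nn.MSELoss()(net(x), y)
loss.backward()

# Gradient norm per parameter tensor, keyed by names like "0.weight" or "6.bias".
grad_norms = {name: p.grad.norm().item() for name, p in net.named_parameters()}
for name, norm in grad_norms.items():
    print(f"{name:10s} grad norm = {norm:.6f}")
```

If the norms for the early layers are much smaller than those for layer 6, plain SGD with a small learning rate will move those layers very slowly.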

Edit: To explain further, when only the final layer has been trained, the NN is effectively trying to guess the output y with no knowledge of the input X. In that case, the best guess it can make is the mean value of y, because a constant prediction equal to the mean is the one that minimizes the MSE loss.
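To see the "best constant guess is the mean" point in numbers, here is a small pure-Python sketch. It samples sin(x) on an evenly spaced grid over one period, tries a range of constant predictions, and keeps the one with the lowest MSE. The grid size, the candidate range, and the helper name mse_of_constant are arbitrary choices for this illustration.

```python
import math

# Sample sin(x) on an evenly spaced grid over [0, 2*pi).
n = 1000
xs = [2 * math.pi * i / n for i in range(n)]
ys = [math.sin(x) for x in xs]

def mse_of_constant(c, targets):
    """MSE of a model that ignores its input and always predicts c."""
    return sum((c - t) ** 2 for t in targets) / len(targets)

mean_y = sum(ys) / len(ys)

# Try constants from -1.0 to 1.0 in steps of 0.01 and keep the best one.
candidates = [k / 100 for k in range(-100, 101)]
best_c = min(candidates, key=lambda c: mse_of_constant(c, ys))
best_mse = mse_of_constant(best_c, ys)

print("mean of targets:", mean_y)
print("best constant:", best_c)
print("MSE at best constant:", best_mse)
```

The best constant lands on the mean of the targets (essentially 0 for a full period of sine), which matches the flat line at y = 0 described in the question.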
