TensorFlow: Simple Linear Regression using CSV data

Question

I am a complete beginner at TensorFlow, and I was tasked with doing a simple linear regression on my CSV data, which contains two columns, Height and State of Charge (SoC), both floats. In the CSV file, Height is the first column and SoC is the second.

Using Height, I'm supposed to predict SoC.

I'm completely lost as to what I have to add in the "Fit all training data" portion of the code. I've looked at other linear regression models and their code is mind-boggling, such as this one:

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        sess.run(training_step,feed_dict={X:train_x,Y:train_y})
        cost_history = np.append(cost_history,sess.run(cost,feed_dict={X: train_x,Y: train_y}))

    #calculate mean square error
    pred_y = sess.run(y_, feed_dict={X: test_x})
    mse = tf.reduce_mean(tf.square(pred_y - test_y))
    print("MSE: %.4f" % sess.run(mse))

#plot cost
plt.plot(range(len(cost_history)),cost_history)
plt.axis([0,training_epochs,0,np.max(cost_history)])
plt.show()

fig, ax = plt.subplots()
ax.scatter(test_y, pred_y)
ax.plot([test_y.min(), test_y.max()], [test_y.min(), test_y.max()], 'k--', lw=3)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()

I've just been able to get data from my CSV file without error using this guide:

TensorFlow: Reading and using data from CSV file

Full Code:

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
rng = np.random

from numpy import genfromtxt
from sklearn.datasets import load_boston

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50
n_samples = 221

X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")

filename_queue = tf.train.string_input_producer(["battdata.csv"],shuffle=False)

reader = tf.TextLineReader(skip_header_lines=1)
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1.], [1.]]
col1, col2= tf.decode_csv(
    value, record_defaults=record_defaults)
features = tf.stack([col1])

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(col1, W), b) # XW + b <- y = mx + b  where W is gradient, b is intercept

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-col2, 2))/(2*n_samples)

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        _, cost_value = sess.run([optimizer,cost])
        for (x, y) in zip(col2, col1):
                sess.run(optimizer, feed_dict={X: x, Y: y})

            #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: col2, Y:col1})
            print( "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b))

        print("Optimization Finished!")
        training_cost = sess.run(cost, feed_dict={X: col2, Y: col1})
        print ("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

        #Graphic display
        plt.plot(train_X, train_Y, 'ro', label='Original data')
        plt.plot(train_X, sess.run(W) * col2 + sess.run(b), label='Fitted line')
        plt.legend()
        plt.show()

    coord.request_stop()
    coord.join(threads)

Error:

INFO:tensorflow:Error reported to Coordinator: , Attempted to use a closed Session.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
      8     for epoch in range(training_epochs):
      9         _, cost_value = sess.run([optimizer,cost])
---> 10         for (x, y) in zip(*col1, col2):
     11                 sess.run(optimizer, feed_dict={X: x, Y: y})
     12

C:\Users\Shiina\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py in __iter__(self)
    514       TypeError: when invoked.
    515     """
--> 516     raise TypeError("'Tensor' object is not iterable.")
    517
    518   def __bool__(self):

TypeError: 'Tensor' object is not iterable.

Vijayachandran Mariappan · Accepted Answer · 2017-06-28 06:47:36Z

1

The error occurs because you are trying to iterate over tensors in for (x, y) in zip(col2, col1), which is not allowed. The other issue with the code is that you have set up input pipeline queues and are then also trying to feed data in through feed_dict, which is wrong. Your training part should look something like this:

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    sess.run(init)

    # Fit all training data. Each sess.run reads the next CSV row from the queue.
    for epoch in range(training_epochs):
        _, cost_value = sess.run([optimizer, cost])

        # Display logs per epoch step
        if (epoch + 1) % display_step == 0:
            c = sess.run(cost)
            print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost)
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Plot data after completing training
    train_X = []
    train_Y = []
    # input_size is your input data size, to loop through the data once
    # (it is not defined above; the question's n_samples holds the row count)
    for i in range(input_size):
        # Run pred to get the prediction with the updated weights
        x_val, y_val = sess.run([col1, pred])
        train_X.append(x_val)
        train_Y.append(y_val)

    # Graphic display (train_Y holds the model's predictions)
    plt.plot(train_X, train_Y, 'ro', label='Predictions')
    plt.legend()
    plt.show()

    coord.request_stop()
    coord.join(threads)
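
As a sanity check on the values of W and b that the training loop should approach, the same model (pred = W * x + b) and the same cost (sum of squared errors divided by 2 * n_samples) can be fitted in plain NumPy, outside TensorFlow. The sketch below is not the method used in this answer. It assumes a small synthetic CSV in place of battdata.csv, uses full-batch gradient descent rather than one row per step, and standardizes Height (an added step) so that the learning rate stays stable for inputs in the 0 to 100 range.

```python
import io
import numpy as np

# Hypothetical stand-in for battdata.csv: a header row, then Height,SoC pairs
# generated from SoC = 0.8 * Height + 5.0 (made-up numbers for illustration).
csv_text = "Height,SoC\n" + "\n".join(
    "{:.1f},{:.2f}".format(h, 0.8 * h + 5.0) for h in np.linspace(0.0, 100.0, 21)
)
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
height, soc = data[:, 0], data[:, 1]

# Standardize the input (assumption: not part of the original code).
mean, std = height.mean(), height.std()
x = (height - mean) / std

W, b = 0.0, 0.0
learning_rate = 0.1
training_epochs = 1000
n_samples = len(x)

for epoch in range(training_epochs):
    pred = W * x + b
    error = pred - soc
    # Gradients of cost = sum(error ** 2) / (2 * n_samples)
    grad_W = np.sum(error * x) / n_samples
    grad_b = np.sum(error) / n_samples
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

# Convert the fitted parameters back to the original Height scale.
slope = W / std
intercept = b - W * mean / std
print("slope=%.4f intercept=%.4f" % (slope, intercept))
```

To try this on real data, pass the CSV path to np.genfromtxt instead of the io.StringIO object. If the slope and intercept found here differ greatly from the W and b printed by the TensorFlow loop, the input pipeline is a likely place to look.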


6 Comments

Tix Over a year ago

Hi! Thanks for your answer. However, using your code, the x-axis of the graph I'm seeing runs from 0 to 1.0, whereas in the CSV data I'm using the x-axis is supposed to run from 0 to 100.0. Did I do something else wrong?

Vijayachandran Mariappan Over a year ago

Check the values of train_X, which gets the col1 values, and see whether they are the same as your CSV data.

Tix Over a year ago

I checked the values of train_X, and they do not correspond to the data I have in my CSV file.

Vijayachandran Mariappan Over a year ago

I just checked with randomly generated input, and it works fine. The data you're seeing must come from somewhere. Can you share the CSV file?

Tix Over a year ago

Here's the download link for both the CSV file and the ipy file for the code: drive.google.com/open?id=0B8Kt9KpV9HnRT0xWdVdJLWJFdWc
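
A quick way to run the check suggested in the comments above is to read the CSV directly and print the range of each column, then compare those ranges with the axes of the plot. This is a minimal sketch: column_ranges is a hypothetical helper, and the inline sample data is made up. It assumes one header row and Height in the first column, as described in the question.

```python
import io
import numpy as np

def column_ranges(csv_source):
    """Return (min, max) for the first two columns of a CSV with one header row."""
    data = np.genfromtxt(csv_source, delimiter=",", skip_header=1)
    return [(data[:, i].min(), data[:, i].max()) for i in range(2)]

# Hypothetical sample; pass "battdata.csv" instead to check the real file.
sample = io.StringIO("Height,SoC\n0.0,0.10\n50.0,0.55\n100.0,1.00\n")
height_range, soc_range = column_ranges(sample)
print("Height range:", height_range)
print("SoC range:", soc_range)
```

If the plotted x-axis matches the SoC range rather than the Height range, the two columns are probably being read or plotted in the wrong order.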
