I've tried the code provided by TensorFlow here

but I can't work out how to adapt the code so that the data ends up in the train_X and train_Y variables.

I'm currently using hard-coded data for train_X and train_Y.

My CSV file contains two columns, Height and State of Charge (SoC). Height is a float, and SoC is an integer that runs from 0 to 100 in increments of 10.

I want to read the data from these columns and use it in a linear regression model, where Height is the Y value and SoC is the X value.

Here's my code:

filename_queue = tf.train.string_input_producer("battdata.csv")

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1]]
col1, col2= tf.decode_csv(
    value, record_defaults=record_defaults)
features = tf.stack([col1, col2])

with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col2])

  coord.request_stop()
  coord.join(threads)

I want to use the CSV data in this model:

import numpy
import tensorflow as tf
import matplotlib.pyplot as plt

rng = numpy.random

# Parameters
learning_rate = 0.01
training_epochs = 1000
display_step = 50

# Training Data
train_X = numpy.asarray([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
                         7.042,10.791,5.313,7.997,5.654,9.27,3.1])
train_Y = numpy.asarray([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
                         2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = train_X.shape[0]

# tf Graph Input
X = tf.placeholder("float")#Charge
Y = tf.placeholder("float")#Height

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b) # XW + b <- y = mx + b  where W is gradient, b is intercept

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c),
                  "W=", sess.run(W), "b=", sess.run(b))

    print("Optimization Finished!")
    training_cost = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
    print("Training cost=", training_cost, "W=", sess.run(W), "b=", sess.run(b), '\n')

    # Graphic display
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(W) * train_X + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()

EDIT:

I also tried the solution provided by Nicolas, but encountered an error:

ValueError: Shape () must have rank at least 1

I solved this issue by adding square brackets around my file name like so:

filename_queue = tf.train.string_input_producer(['battdata.csv'])

2 Answers

All you need to do is replace your placeholder tensors with the ops returned by decode_csv. That way, each time you run the optimiser, the TensorFlow graph requests a new row from the file through the tensor dependencies:

optimiser => cost => pred => X
optimiser => cost => Y

That would give something like this:

# string_input_producer expects a list (a 1-D tensor) of file names,
# not a bare string.
filename_queue = tf.train.string_input_producer(["battdata.csv"])

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

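# Default values, in case of empty columns. Also specifies the type of the
# decoded result. Both defaults are floats so that X and Y match the dtype
# of W and b.
record_defaults = [[1.], [1.]]
# Column order as described in the question: Height (Y) first, then SoC (X).
Y, X = tf.decode_csv(
    value, record_defaults=record_defaults)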

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a linear model
pred = tf.add(tf.multiply(X, W), b) # XW + b <- y = mx + b  where W is gradient, b is intercept

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
  # Initialize W and b before running anything that uses them.
  sess.run(init)

  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  # Fit all training data
  for epoch in range(training_epochs):
      _, cost_value = sess.run([optimizer, cost])

  [...]  # The rest of your code

  coord.request_stop()
  coord.join(threads) 
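
If the queue-based reader is more machinery than you need, a different route is to load the whole file up front and keep the placeholder version of the model unchanged. The following is only a minimal sketch using Python's standard csv module, not the TensorFlow input pipeline described above. The helper name load_soc_height, the header row, and the sample values are assumptions made for illustration; the column order (Height first, then SoC) is taken from the question.

```python
import csv
import io


def load_soc_height(csv_file, has_header=True):
    """Read (Height, SoC) rows and return (train_X, train_Y) as lists of floats."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row (assumed to be present)
    train_X, train_Y = [], []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or incomplete lines
        height, soc = float(row[0]), float(row[1])
        train_X.append(soc)     # SoC is the x value
        train_Y.append(height)  # Height is the y value
    return train_X, train_Y


# Made-up rows standing in for battdata.csv.
sample = io.StringIO("Height,SoC\n1.25,0\n1.5,10\n2.0,20\n")
train_X, train_Y = load_soc_height(sample)
print(train_X, train_Y)
```

With the real file you would pass an open file object instead, for example `with open("battdata.csv", newline="") as f: train_X, train_Y = load_soc_height(f)`, and then wrap both lists in numpy.asarray so that `n_samples = train_X.shape[0]` in the question's model keeps working.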

I had the same problem and resolved it like this:

tf.train.string_input_producer(tf.train.match_filenames_once("medal.csv"))

Found this here: TensorFlow From CSV to API
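
Both this fix and the square-bracket fix seem to work for the same reason: string_input_producer wants a one-dimensional collection of file names, and a bare string is a scalar, which is what "Shape () must have rank at least 1" is complaining about. As a small sketch, a guard like the one below normalizes the argument before it reaches TensorFlow. The helper name as_filename_list is hypothetical and not part of TensorFlow.

```python
def as_filename_list(filenames):
    """Return filenames as a list, wrapping a single name in a list."""
    if isinstance(filenames, (str, bytes)):
        return [filenames]  # a bare string is a scalar, so wrap it
    return list(filenames)


print(as_filename_list("battdata.csv"))
print(as_filename_list(["a.csv", "b.csv"]))
```

The result can then be passed straight through, as in `tf.train.string_input_producer(as_filename_list("battdata.csv"))`. Note that match_filenames_once stores its result in a local variable, so it most likely also needs `sess.run(tf.local_variables_initializer())` before the queue runners start.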
