I am new to TensorFlow and am trying to run a neural network on a .csv file with 60 columns. However, some of the columns contain string fields. When I ran the program, I got the error "could not convert string to float". This is the code:

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
    filename=TRAINING,
    target_dtype=np.int,
    features_dtype=np.float32)

test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
    filename=TEST,
    target_dtype=np.int,
    features_dtype=np.float32)

# Specify that all features have real-value data
feature_columns = [tf.feature_column.numeric_column("x", shape=[59])]

classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[59],
                                        n_classes=2)

I read that target_dtype and features_dtype take NumPy types. I searched https://docs.scipy.org/doc/numpy/user/basics.types.html, and it looks like there is no string type listed there. What is the best way to handle string fields?

1 Answer

There are two ways.

First, you can modify the data in your CSV file, removing the strings that cannot be converted to float. To use the demo code in the tf.estimator Quickstart, keep your CSV in the same format as iris_training.csv or iris_test.csv.
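As a rough sketch of the first approach, the snippet below drops every column that contains a field that does not parse as a float and writes the remaining columns back out as CSV. The helper names (is_float, drop_non_numeric_columns) and the sample data are made up for illustration; only the standard csv and io modules are used.

```python
import csv
import io


def is_float(field):
    """Return True if the CSV field can be converted with float()."""
    try:
        float(field)
        return True
    except ValueError:
        return False


def drop_non_numeric_columns(rows):
    """Keep only the columns in which every field parses as a float.

    `rows` is a list of lists of strings, as produced by csv.reader.
    Returns (cleaned_rows, kept_column_indices).
    """
    n_cols = len(rows[0])
    keep = [c for c in range(n_cols) if all(is_float(r[c]) for r in rows)]
    return [[r[c] for c in keep] for r in rows], keep


# Hypothetical sample: column 1 holds strings, the last column is the label.
sample = "5.1,red,3.5,0\n4.9,blue,3.0,1\n6.2,red,2.9,1\n"
rows = list(csv.reader(io.StringIO(sample)))
cleaned, kept = drop_non_numeric_columns(rows)

# Write the cleaned rows back out in CSV form (here to an in-memory buffer).
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(cleaned)
```

Note that dropping columns changes the number of features, so the shape passed to numeric_column (59 in the question) would have to match the new feature count.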

Second, you can modify the code of the load_csv_without_header function that you called. The original code looks like this:

def load_csv_without_header(filename,
                            target_dtype,
                            features_dtype,
                            target_column=-1):
  """Load dataset from CSV file without a header row."""
  with gfile.Open(filename) as csv_file:
    data_file = csv.reader(csv_file)
    data, target = [], []
    for row in data_file:
      target.append(row.pop(target_column))
      data.append(np.asarray(row, dtype=features_dtype))

    target = np.array(target, dtype=target_dtype)
    data = np.array(data)
    return Dataset(data=data, target=target)

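Here, it uses some common modules, such as csv, numpy, and collections; Python built-ins, such as next and enumerate; and TensorFlow utilities, such as gfile. The call np.asarray(row, dtype=features_dtype) is what raises "could not convert string to float" when a row contains a string field. You can debug this code and then adapt it to your data.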
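One possible modification, sketched under some assumptions: instead of failing on a string field, map each distinct string in a column to an integer code before building the float array. The function name load_csv_with_strings and its returned vocab are hypothetical and not part of TensorFlow. To keep the example self-contained it takes an open file object rather than a filename and returns a plain tuple instead of a Dataset.

```python
import csv
import io

import numpy as np


def load_csv_with_strings(csv_file, target_dtype, features_dtype,
                          target_column=-1):
    """Like load_csv_without_header, but tolerates string feature fields.

    `csv_file` is any open text file object (an assumption of this sketch).
    Returns (data, target, vocab), where vocab maps
    column index -> {string value: integer code}.
    """
    vocab = {}
    data, target = [], []
    for row in csv.reader(csv_file):
        target.append(row.pop(target_column))
        values = []
        for col, field in enumerate(row):
            try:
                values.append(float(field))
            except ValueError:
                # Non-numeric field: assign the next unused code the first
                # time a value is seen in this column, then reuse it.
                codes = vocab.setdefault(col, {})
                values.append(float(codes.setdefault(field, len(codes))))
        data.append(np.asarray(values, dtype=features_dtype))
    return np.array(data), np.array(target, dtype=target_dtype), vocab


# Hypothetical sample: column 1 holds strings, the last column is the label.
sample = io.StringIO("5.1,red,0\n4.9,blue,1\n6.2,red,1\n")
data, target, vocab = load_csv_with_strings(sample, np.int64, np.float32)
```

Integer codes imply an ordering between categories that usually does not exist, so for real categorical data a one-hot or embedding representation may suit a DNN better than feeding the codes through a numeric column.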

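Alternatively, you can parse the file with tf.decode_csv, whose record_defaults argument sets the type of each column, so string columns are allowed.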

Finally, welcome to TensorFlow.
