1

I am using TensorFlow 0.10.0rc0 with CUDA driver version 7.5 and cuDNN 4 on Ubuntu 14.04.

I have a simple CSV file that contains a single record like this:

"field with
newline",0

where the newline was added by pressing the Enter key in Vim on Ubuntu. I can read this file in pandas with the read_csv function, which shows the text field as containing a single \n character.

But when I try to read it in TensorFlow, I get the following error:

tensorflow.python.framework.errors.InvalidArgumentError: Quoted field has to end with quote followed by delim or end

My TensorFlow code for reading the CSV uses this function to read a single row:

import tensorflow as tf

def read_single_example(filename_queue, skip_header_lines, record_defaults, feature_index, label_index):
    # TextLineReader emits one physical line of the file per read.
    reader = tf.TextLineReader(skip_header_lines=skip_header_lines)
    key, value = reader.read(filename_queue)
    # decode_csv parses that single line into one tensor per column.
    record = tf.decode_csv(
        value,
        record_defaults=record_defaults)
    features, label = record[feature_index], record[label_index]
    return features, label

If I read using pandas and replace all newlines with spaces, the TensorFlow code is able to parse the CSV successfully.
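The same preprocessing can be done without pandas. Here is a minimal sketch, assuming the standard-library csv module is acceptable for a one-off cleanup pass. The function name flatten_newlines is hypothetical and not part of TensorFlow or pandas:

```python
import csv
import io


def flatten_newlines(csv_text):
    """Return csv_text with newlines inside fields replaced by spaces.

    Hypothetical helper for illustration; it parses with Python's csv
    module, which accepts newlines inside quoted fields.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Replace CRLF first so it becomes one space rather than two.
        writer.writerow(
            [field.replace("\r\n", " ").replace("\n", " ") for field in row]
        )
    return out.getvalue()


original = '"field with\nnewline",0\n'
print(repr(flatten_newlines(original)))
```

After this pass every record occupies exactly one physical line, which is the shape a line-based reader expects.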

It would be really helpful if newlines could be handled within the TensorFlow CSV pipeline itself.

1
  • The RFC 4180 spec says it is allowed, and Python's default csv dialect is 'excel', which should be able to handle it. Commented Aug 17, 2016 at 12:55

2 Answers

1

The issue here is that TextLineReader splits the file on newlines before the CSV decoder parses it, so the decoder only ever sees a fragment of the quoted field. With tf.data, you can use tf.contrib.data.CsvDataset, which parses this file correctly according to RFC 4180.
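The effect of splitting on newlines before parsing can be reproduced outside TensorFlow. The following is only an illustration with Python's standard-library csv module, not TensorFlow's actual implementation; the variable names are made up for the example:

```python
import csv
import io

# The file from the question: one record whose first field contains a newline.
data = '"field with\nnewline",0\n'

# A line-oriented reader hands the decoder one physical line at a time,
# so the quoted field is cut in half before any CSV parsing happens.
line_records = data.splitlines()

# A parser that sees the whole stream can keep the quoted newline
# inside the field and return a single record.
rows = list(csv.reader(io.StringIO(data, newline="")))

print(line_records)
print(rows)
```

The first fragment opens a quote that is never closed, which matches the kind of input that a per-line CSV decoder would reject.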

0

In my experience, TensorFlow's CSV reader is quite strict about RFC 4180.

Making sure your files use CRLF line endings, both at the end of each record and inside quoted fields, should allow them to be processed.

Note: I have only used this up to 0.9 so far. I have not tried the 0.10 release candidates.

3 Comments

The same error exists in TensorFlow 0.9 as well. I have given my entire CSV file in the question above. As far as I know, this CSV file obeys RFC 4180. Please see @you's comment above.
Sorry for the long silence. Have you resolved your issue? An update would be great, for example an answer to your own question. I have not managed to reproduce it so far.
Hi @EricPlaton, I raised the issue on the TensorFlow GitHub and I assume they are working on it. github.com/tensorflow/tensorflow/issues/3851
