
How to decode a CSV file with long lines (e.g., so many items per line that it is not realistic to list them one by one for output) with tf.TextLineReader() and tf.decode_csv?

The typical usage is:

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
record_defaults = [1, 1, 1, 1, 1]
a, b, c, d, e = tf.decode_csv(records=value, record_defaults=record_defaults, field_delim=" ")

When a line has thousands of items, it is impractical to assign them one by one as with (a, b, c, d, e) above. Can all the items be decoded into a list or something similar?

3 Answers

3

Let's say you have 1800 columns of data. You can use this as record_defaults:

record_defaults=[[1]]*1800

and then use

all_columns = tf.decode_csv(value, record_defaults=record_defaults)

to read them.
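Because the result is an ordinary Python list of scalar tensors, it can be packed into a single tensor so the columns never need individual names. A minimal sketch, assuming a TensorFlow version that provides tf.io.decode_csv (the newer name for tf.decode_csv) and a synthetic comma-separated line built in memory purely for illustration:

```python
import tensorflow as tf

NUM_COLUMNS = 1800  # column count assumed in the answer above

# Synthetic input line "0,1,2,...,1799" (an assumption, standing in for a real file line).
line = ",".join(str(i) for i in range(NUM_COLUMNS))

# One rank-1 integer default per column.
record_defaults = [[1]] * NUM_COLUMNS

# decode_csv returns a list with one scalar tensor per column.
columns = tf.io.decode_csv(line, record_defaults=record_defaults)

# Pack the list into a single 1-D tensor of shape (NUM_COLUMNS,).
row = tf.stack(columns)
```

In a queue-based pipeline, the same tf.stack call could be applied to the list decoded from value instead of the in-memory line.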


0

Well, tf.decode_csv returns a list, so you can simply do:

record_defaults = [[1], [1], [1], [1], [1]]
all_columns = tf.decode_csv(value, record_defaults=record_defaults)
all_columns
Out: [<tf.Tensor 'DecodeCSV:0' shape=() dtype=int32>,
 <tf.Tensor 'DecodeCSV:1' shape=() dtype=int32>,
 <tf.Tensor 'DecodeCSV:2' shape=() dtype=int32>,
 <tf.Tensor 'DecodeCSV:3' shape=() dtype=int32>,
 <tf.Tensor 'DecodeCSV:4' shape=() dtype=int32>
]

You can then evaluate it as usual:

sess = tf.Session() 
sess.run(all_columns)
Out: [1, 1, 1, 1, 1]

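Note that each entry of record_defaults needs to be rank 1, for example [1] rather than 1. If sess.run hangs on the queue, a common cause is that the queue runners have not been started with tf.train.start_queue_runners.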
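On the hanging-queue point: with the queue-based reader from the question, sess.run blocks until something fills the filename queue, which is what the queue runners do. The following is a minimal end-to-end sketch, assuming the TF1-style API is reached through tensorflow.compat.v1 and using a hypothetical two-line, space-delimited temporary file:

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

# Hypothetical sample file with two space-delimited rows of five integers.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w") as f:
    f.write("1 2 3 4 5\n6 7 8 9 10\n")

with tf.Graph().as_default():
    # shuffle=False keeps the read order deterministic for this sketch.
    filename_queue = tf.train.string_input_producer([path], shuffle=False)
    reader = tf.TextLineReader()
    key, value = reader.read(filename_queue)

    # Rank-1 defaults, one per column; the list of columns is stacked into one tensor.
    columns = tf.io.decode_csv(value, record_defaults=[[1]] * 5, field_delim=" ")
    row = tf.stack(columns)

    with tf.Session() as sess:
        # Without these two lines, sess.run(row) would block forever.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        first = sess.run(row)   # first line of the file
        second = sess.run(row)  # second line of the file

        coord.request_stop()
        coord.join(threads)
```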

0

Here is how I mix different dtypes in record_defaults:

record_defaults = [tf.constant(.1, dtype=tf.float32) for count in range(100)] # 100 fp32 features
record_defaults.extend([tf.constant(1, dtype=tf.int32) for count in range(2)]) # 2 int32 features
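With mixed dtypes, the decoded list can be sliced by position so that each group of columns is stacked into its own tensor. A minimal sketch, assuming tf.io.decode_csv is available, the same 100 float and 2 integer columns as above, and a synthetic comma-separated line used only for illustration:

```python
import tensorflow as tf

NUM_FLOAT, NUM_INT = 100, 2  # column counts taken from the snippet above

# Float defaults select float32 columns, integer defaults select int32 columns.
record_defaults = [[0.1]] * NUM_FLOAT + [[1]] * NUM_INT

# Synthetic input line: 100 float fields followed by 2 integer fields (an assumption).
line = ",".join(["0.5"] * NUM_FLOAT + ["7"] * NUM_INT)

columns = tf.io.decode_csv(line, record_defaults=record_defaults)

# Stack each dtype group separately, since tf.stack requires a single dtype.
float_features = tf.stack(columns[:NUM_FLOAT])  # float32, shape (100,)
int_features = tf.stack(columns[NUM_FLOAT:])    # int32, shape (2,)
```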
