I have a CSV file where each line has the form

"0,0,0,0,0,1,0,0,0,20,0,17,0,0"

I tried to read the data in with this function:

    import tensorflow as tf

    def decode_csv(line):
        # Split one CSV line into fields; the last field is the label.
        line_split = tf.string_split([line], ',')
        features = tf.string_to_number(line_split.values[:-1], tf.int32)
        label = tf.string_to_number(line_split.values[-1], tf.int32)
        return features, label

    dataset = tf.data.TextLineDataset("Documents/t1.csv").skip(1).map(decode_csv)
    dataset = dataset.shuffle(buffer_size=2).repeat(-1).batch(2)
    dataset_init = dataset.make_initializable_iterator()
    x, y = dataset_init.get_next()

I want to convert each line to the form

    [0,0,0,0,0,1,0,0,0,20,0,17,0]

for x and

    [0]

for y.

I am receiving the error:

    InvalidArgumentError (see above for traceback): StringToNumberOp could not correctly convert string: "0
    [[Node: StringToNumber = StringToNumber[out_type=DT_FLOAT](strided_slice)]]
    [[Node: IteratorGetNext_25 = IteratorGetNext[output_shapes=[[?,?], [?]], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](Iterator_25)]]

2 Answers

It looks like you need to strip the double quotes from the string.

Try this:

    def decode_csv(line):
        # `line` is a tf.string tensor here, so Python's str.strip is not
        # available; remove the double quotes with a TensorFlow op instead.
        line = tf.regex_replace(line, '"', '')
        line_split = tf.string_split([line], ',')
        features = tf.string_to_number(line_split.values[:-1], tf.int32)
        label = tf.string_to_number(line_split.values[-1], tf.int32)
        return features, label

    dataset = tf.data.TextLineDataset("Documents/t1.csv").skip(1).map(decode_csv)
    dataset = dataset.shuffle(buffer_size=2).repeat(-1).batch(2)
    dataset_init = dataset.make_initializable_iterator()
    x, y = dataset_init.get_next()
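
To see why the quotes matter, here is a minimal plain-Python sketch of the same parsing logic, with no TensorFlow involved. The helper name `parse_quoted_line` is hypothetical and only illustrates the idea; inside `Dataset.map` the line is a string tensor rather than a Python `str`, so the real pipeline needs TensorFlow ops for the same steps.

```python
def parse_quoted_line(line):
    """Strip the surrounding double quotes, then split into (features, label)."""
    fields = line.strip().strip('"').split(",")
    values = [int(field) for field in fields]
    # Everything except the last value is a feature; the last value is the label.
    return values[:-1], values[-1]


raw = '"0,0,0,0,0,1,0,0,0,20,0,17,0,0"'

# Splitting without stripping leaves a quote attached to the first field,
# which is the same token the StringToNumberOp error complains about.
print(raw.split(",")[0])  # "0

features, label = parse_quoted_line(raw)
print(features)  # [0, 0, 0, 0, 0, 1, 0, 0, 0, 20, 0, 17, 0]
print(label)     # 0
```

The last field has the same problem (`0"`), so the label conversion would fail as well if only the leading quote were removed.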

Using the idea from @agillgilla, I got this to work:

    def decode_csv(line):
        # py_func passes the line as bytes; decode it and strip the quotes in Python.
        line = tf.py_func(lambda x: x.decode("utf-8").strip('"'), [line], tf.string)
        line_split = tf.string_split([line], ',')
        features = tf.string_to_number(line_split.values[:-1], tf.int32)
        label = tf.string_to_number(line_split.values[-1], tf.int32)
        return features, label

    dataset = tf.data.TextLineDataset("Documents/t1.csv").skip(1).map(decode_csv)
    dataset = dataset.shuffle(buffer_size=2).repeat(-1).batch(2)
    dataset_init = dataset.make_initializable_iterator()
    x, y = dataset_init.get_next()
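
If changing the input pipeline is not desirable, another option might be to clean the file once before TensorFlow reads it, so the original `decode_csv` can stay as it was. The sketch below assumes every line is wrapped in a single pair of double quotes, as in the example above; the function name `strip_quotes_from_file` is hypothetical.

```python
def strip_quotes_from_file(src_path, dst_path):
    """Copy src_path to dst_path, removing double quotes around each line."""
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Drop the line ending first so the trailing quote can be stripped.
            dst.write(line.rstrip("\r\n").strip('"') + "\n")
```

For example, `strip_quotes_from_file("Documents/t1.csv", "Documents/t1_clean.csv")` would write a cleaned copy (the output file name is made up for illustration), and `TextLineDataset` could then point at that copy.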
