
This is the code I used to convert the data to TFRecord:

import numpy as np
import tensorflow as tf

# Helpers that wrap raw values into tf.train.Feature protos.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _floats_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

# Write one Example per row of train_data.
with tf.python_io.TFRecordWriter("train.tfrecords") as writer:
    for row in train_data:
        prices, label, pip = row[0], row[1], row[2]
        prices = np.asarray(prices).astype(np.float32)
        example = tf.train.Example(features=tf.train.Features(feature={
            'prices': _floats_feature(prices),
            'label': _int64_feature(label[0]),
            'pip': _floats_feature(pip)
        }))
        writer.write(example.SerializeToString())

The prices feature is an array of shape (1, 288). It converted successfully, but decoding the data with a parse function and the Dataset API failed:

def parse_func(serialized_data):
    keys_to_features = {'prices': tf.FixedLenFeature([], tf.float32),
                        'label': tf.FixedLenFeature([], tf.int64)}

    parsed_features = tf.parse_single_example(serialized_data, keys_to_features)
    return parsed_features['prices'], tf.one_hot(parsed_features['label'], 2)

It gave me the following error:

2018-03-31 15:37:11.443073: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: prices. Can't parse serialized Example.
2018-03-31 15:37:11.443313: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at example_parsing_ops.cc:240 : Invalid argument: Key: prices. Can't parse serialized Example.

  raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Key: prices. Can't parse serialized Example.
  [[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_INT64, DT_FLOAT], dense_keys=["label", "prices"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const_1)]]
  [[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[[?], [?,2]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]]

5 Answers


I found the problem. Instead of using tf.io.FixedLenFeature to parse an array, use tf.io.FixedLenSequenceFeature (for TensorFlow 1, use tf. instead of tf.io.).
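
A minimal sketch of the reworked parse function, reusing the feature names from the question (TF 1.x spelling; with TF 2 use the tf.io equivalents):

import tensorflow as tf

def parse_func(serialized_data):
    # FixedLenSequenceFeature parses the whole float list; allow_missing=True
    # is required when the feature lives in a plain (non-sequence) Example.
    keys_to_features = {'prices': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
                        'label': tf.FixedLenFeature([], tf.int64)}

    parsed_features = tf.parse_single_example(serialized_data, keys_to_features)
    return parsed_features['prices'], tf.one_hot(parsed_features['label'], 2)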


4 Comments

Can you elaborate a little bit more on how this actually solved your problem and what your code looks like that works?
@JonDeaton As I mentioned above, I got the error when using FixedLenFeature, but it worked when I changed to FixedLenSequenceFeature. Prices is a 1-D array. For encoding to TFRecord I used def _floats_feature(value): return tf.train.Feature(float_list=tf.train.FloatList(value=value)), and for decoding: keys_to_features = {'prices': tf.FixedLenSequenceFeature([], dtype=tf.float32, allow_missing=True), 'label': tf.FixedLenFeature([], tf.int64)}
@SajadNorouzi has an answer below that seems more correct. I've managed to get both methods to work. However, I'm not sure the documentation is as clear as he states; possibly it has been edited in the meantime, but it only seems to imply that FixedLenSequenceFeature should be used for dimension 2 or higher. It may be worthwhile to edit this answer to mention the other answer, or to sort out which of the two methods is truly correct, or whether they both are. Cheers!
Had a similar issue because my features were stored as lists of tf.string, tf.float32 or tf.float64, so providing the respective feature description helped, e.g. "your key": tf.io.FixedLenSequenceFeature([], tf.string, allow_missing=True)

If your feature is a fixed 1-D array, then using tf.FixedLenSequenceFeature is not correct at all. As the documentation mentions, tf.FixedLenSequenceFeature is for input data with dimension 2 or higher. In this example you need to flatten the price array to shape (288,), and then, in the decoding part, specify the array's length.

Encode:

example = tf.train.Example(features=tf.train.Features(feature={
    'prices': _floats_feature(prices.tolist()),
    'label': _int64_feature(label[0]),
    'pip': _floats_feature(pip)
}))

Decode:

keys_to_features = {'prices': tf.FixedLenFeature([288], tf.float32),
                    'label': tf.FixedLenFeature([], tf.int64)}
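
For completeness, a minimal end-to-end sketch of this approach (TF 1.x API; it reuses the _floats_feature/_int64_feature helpers and the prices/label/pip variables from the question, and assumes prices arrives with shape (1, 288)):

import numpy as np
import tensorflow as tf

# Encode: flatten the (1, 288) array into a plain list of 288 floats.
flat_prices = np.asarray(prices, dtype=np.float32).reshape(-1)
example = tf.train.Example(features=tf.train.Features(feature={
    'prices': _floats_feature(flat_prices.tolist()),
    'label': _int64_feature(label[0]),
    'pip': _floats_feature(pip)
}))

# Decode: state the fixed length so the parser knows the dense shape.
def parse_func(serialized_data):
    keys_to_features = {'prices': tf.FixedLenFeature([288], tf.float32),
                        'label': tf.FixedLenFeature([], tf.int64)}
    parsed = tf.parse_single_example(serialized_data, keys_to_features)
    return parsed['prices'], tf.one_hot(parsed['label'], 2)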

Comments


You can't store an n-dimensional array as a float feature, because float features are simple flat lists. You have to flatten prices into a list, e.g. by doing prices.tolist(). If you need to recover the n-dimensional array from the flattened float feature, you can do prices = np.reshape(float_feature, original_shape).
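
A small sketch of that flatten-and-restore round trip, assuming prices is the (1, 288) array from the question (parsed_prices is just a placeholder for whatever you get back after parsing):

import numpy as np
import tensorflow as tf

original_shape = prices.shape                         # e.g. (1, 288)
flat_prices = prices.ravel().tolist()                 # flat Python list of floats
feature = tf.train.Feature(float_list=tf.train.FloatList(value=flat_prices))

# ... later, after reading the record back and parsing the flat float feature:
restored = np.reshape(parsed_prices, original_shape)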

1 Comment

It still doesn't work with a flattened list. I still got the error above.

I had the same issue while carelessly modifying some scripts; it was caused by a slightly different data shape. I had to change the shape to match the expected shape, e.g. from (A, B) to (1, A, B). I used np.ravel() for flattening.
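
For example, a quick sketch of that reshape/flatten step (the shapes here are purely illustrative):

import numpy as np

data = np.zeros((4, 288), dtype=np.float32)     # shape (A, B)
data = data.reshape((1,) + data.shape)          # add the leading axis: (1, A, B)
flat = data.ravel()                             # flatten before writing to the FloatList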

Comments


Exactly the same thing happens to me when reading float32 data lists from TFRecord files.

I get Can't parse serialized Example when executing sess.run([time_tensor, frequency_tensor, frequency_weight_tensor]) with tf.FixedLenFeature, though tf.FixedLenSequenceFeature seems to be working fine.

My feature format for reading files (the working one) is as follows:

feature_format = {
    'time': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    'frequencies': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
    'frequency_weights': tf.FixedLenSequenceFeature([], tf.float32, allow_missing=True)
}

The encoding part is:

feature = {
    'time': tf.train.Feature(float_list=tf.train.FloatList(value=[*some single value*])),
    'frequencies': tf.train.Feature(float_list=tf.train.FloatList(value=*some_list*)),
    'frequency_weights': tf.train.Feature(float_list=tf.train.FloatList(value=*some_list*))
}

This happens with TensorFlow 1.12 on a Debian machine without GPU offloading (i.e. only the CPU is used with TensorFlow).

Is there any misuse on my side, or is it a bug in the code or documentation? I could contribute/upstream a fix if that would benefit anyone...

Comments
