I have a pandas DataFrame like the one below and want to convert it to a torch.tensor so I can feed it to an embedding layer.
# print the first 5 rows as an example
print(df['col'].head(5))
col
0 [a, bc, cd]
1 [d, ed, fsd, g, h]
2 [i, hh, ihj, gfw, hah]
3 [a, cb]
4 [sad]
I tried converting it with:
train_tensor = torch.from_numpy(train)
But this raises an error:
TypeError: can't convert np.ndarray of type numpy.str_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.
It seems that from_numpy() doesn't support variable-length sequences.
So what is the proper way to initialize a tensor from this data?
Once I have the tensor, I plan to pad the variable-length sequences and pass them through an embedding layer.
Could anyone help me?
Thanks in advance.
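For reference, here is a minimal sketch of the pipeline described above (encode, pad, embed). It is only one possible approach under stated assumptions: the DataFrame is rebuilt by hand from the five rows shown, and the names vocab, PAD_IDX, encoded, and padded, as well as the embedding size of 8, are made up for illustration. Since the error message says string arrays are not a supported type, the sketch first maps each token to an integer index.

```python
import pandas as pd
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence

# Toy data rebuilt by hand from the five rows printed in the question.
df = pd.DataFrame({"col": [
    ["a", "bc", "cd"],
    ["d", "ed", "fsd", "g", "h"],
    ["i", "hh", "ihj", "gfw", "hah"],
    ["a", "cb"],
    ["sad"],
]})

# Hypothetical vocabulary: token -> integer index, with 0 reserved for padding.
PAD_IDX = 0
vocab = {}
for tokens in df["col"]:
    for tok in tokens:
        vocab.setdefault(tok, len(vocab) + 1)

# Encode every row as a 1-D LongTensor of token indices (lengths still differ).
encoded = [
    torch.tensor([vocab[tok] for tok in tokens], dtype=torch.long)
    for tokens in df["col"]
]

# Pad all rows to the length of the longest one: shape (num_rows, max_len).
padded = pad_sequence(encoded, batch_first=True, padding_value=PAD_IDX)

# The embedding size (8) is arbitrary; padding_idx keeps the pad vector at zero.
embedding = nn.Embedding(len(vocab) + 1, 8, padding_idx=PAD_IDX)
embedded = embedding(padded)  # shape (num_rows, max_len, 8)
```

With real data, the vocabulary would normally be built from the training split only, with an extra index reserved for unknown tokens.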
What is train? And what are those 5 literal arrays? Could we get a more precise code snippet?