I have a TSV file in which, on some lines, one column mixes plain text with literal \uXXXX escape sequences, for example: Hapoel_Be\u0027er_Sheva_A\u002eF\u002eC\u002e, which should read Hapoel_Be'er_Sheva_A.F.C..
And here is the code I use to read the file and split the columns:
with open(path, 'rb') as f:
    for line in f:
        cols = line.decode('utf-8').split('\t')
        # This is the column that has the mixed format described above
        text = cols[3].decode('unicode-escape')
Error message:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0160' in position 6: ordinal not in range(128)
How can I convert the escaped form into the plain form while reading the file? I'm using Python 2.7.
Thank you so much.
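For what it's worth, the error probably comes from calling .decode() on a value that is already a unicode object: in Python 2.7 that first encodes the string to ASCII implicitly, which fails as soon as the column contains a real non-ASCII character such as u'\u0160'. One possible workaround, shown here only as a sketch, is to leave the decoded text alone and replace just the literal \uXXXX sequences with a regular expression. The helper name unescape_u is made up for illustration, and the sketch assumes every escape has exactly four hex digits.

```python
# -*- coding: utf-8 -*-
import re

try:
    unichr            # Python 2
except NameError:
    unichr = chr      # Python 3 has no unichr; chr does the same job

# A literal backslash, the letter u, then exactly four hex digits.
_U_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def unescape_u(text):
    """Replace literal \\uXXXX sequences in an already-decoded string.

    Hypothetical helper: characters that are not part of an escape
    sequence, including non-ASCII ones, are left untouched.
    """
    return _U_ESCAPE.sub(lambda m: unichr(int(m.group(1), 16)), text)


sample = u'Hapoel_Be\\u0027er_Sheva_A\\u002eF\\u002eC\\u002e'
print(unescape_u(sample))
```

In the loop above, this would stand in for the failing call, i.e. text = unescape_u(cols[3]). Escapes that come in surrogate pairs (characters outside the Basic Multilingual Plane) are not handled by this sketch and would need extra care.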