I converted the result of a kdb query into a pandas DataFrame and then wrote that DataFrame to a CSV file. The string columns came through as bytes, which I fixed for most of them by decoding as UTF-8. However, there is one column for which this did not work.
"nameFid" is the column that isn't working correctly. In the CSV file its values appear as " b'STRING' ".
I am running Python 3.7, any other information needed I will be happy to provide.
Here is my code, which decodes the data in the DataFrame I get from kdb:
# Excerpt from the body of a function that receives df
for ba in df.dtypes.keys():
    if df.dtypes[ba] == 'O':
        try:
            df[ba] = df[ba].apply(lambda x: x.decode('UTF-8'))
        except Exception as e:
            print(e)
return df
This worked for every column except "nameFid". Without the try/except, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdc in position 6: invalid continuation byte
I thought this suggests that the data isn't encoded as UTF-8, but surely that would mean none of the other columns would decode either?
With the try/except in place, it instead prints "'Series' object has no attribute 'decode'".
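A minimal sketch of why one column can fail while the others decode cleanly. The byte strings here are made up for illustration and are not taken from the real kdb data, and Latin-1 is only an assumed candidate for the source encoding. Bytes that contain only ASCII characters decode the same way under UTF-8 and most single-byte encodings, so a wrong guess about the encoding only shows up in values that contain a non-ASCII byte such as 0xdc.

```python
# Illustrative values only; the real "nameFid" contents are unknown.
ascii_only = b"STRING"       # every byte is below 0x80
non_ascii = b"M\xdcLLER"     # 0xDC is the letter U-umlaut in Latin-1

# Pure ASCII is valid UTF-8, whatever encoding the source really used.
ascii_text = ascii_only.decode("utf-8")
print(ascii_text)

# 0xDC announces a two-byte UTF-8 sequence, but the next byte (b"L")
# is not a continuation byte, so decoding as UTF-8 fails.
try:
    non_ascii.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print(exc)

# Assumption: the data is Latin-1. Latin-1 maps every byte to a character,
# so this never raises, but it is only correct if the guess is right.
latin1_text = non_ascii.decode("latin-1")
print(latin1_text)
```

If this is what is happening, the other columns would simply be the ones that contain no non-ASCII bytes.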
My goal is to remove the "b''" from the column values, which currently show
" b'STRING' "
I'm not sure what else I need to add. Let me know if you need anything.
Also, sorry, I am quite new to all of this.
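One possible way to get rid of the b'' wrapper is to decode value by value with a fallback encoding, rather than assuming every value is UTF-8 bytes. This is a sketch under assumptions: the helper name to_text is hypothetical, and Latin-1 as the fallback is a guess that should be checked against what the kdb source really uses.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Return value as str, decoding bytes with the first encoding that works.

    Hypothetical helper; the default encodings are assumptions.
    """
    if not isinstance(value, (bytes, bytearray)):
        # Already a str, None, NaN, a number, etc.: leave it untouched.
        return value
    for enc in encodings:
        try:
            return value.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reached if no listed encoding worked (Latin-1 itself never fails):
    # keep the row, but mark undecodable bytes with U+FFFD.
    return value.decode("utf-8", errors="replace")


print(to_text(b"STRING"))
print(to_text(b"M\xdcLLER"))
```

With the DataFrame from the question, this could be applied to the problem column as df["nameFid"] = df["nameFid"].map(to_text). Because the helper passes non-bytes values through unchanged, it also tolerates columns that mix bytes with missing values.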
chardet package.