2

I saved a pandas dataframe as a csv using

df_to_save.to_csv(save_file_path)

but when I read it back in using

df_temp = pd.read_csv(file_path)

I get an error message saying

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 158: invalid start byte

I've tried forcing the encoding to utf-8 when reading the csv file with

df_temp = pd.read_csv(file_path, index_col=False, encoding="utf-8", sep=',')

Really stuck, can anyone help?

Many thanks

  • Can you post raw text data or a link to your actual csv? The encoding may not be what you think it is. Commented Dec 9, 2016 at 10:17
  • This question is similar to: Encoding error when reading csv file containing pandas dataframe. If you believe it’s different, please edit the question, make it clear how it’s different and/or how the answers on that question are not helpful for your problem. Commented Sep 24 at 11:16
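
One way to check what the file actually contains is to look at the raw bytes near the offset reported in the error. The sketch below is only illustrative: `bytes_around` is a hypothetical helper (not part of pandas), the demo file is made up, and the position in the error message may be relative to the chunk being decoded rather than the start of the file, so treat it as a hint.

```python
import os
import tempfile


def bytes_around(path, position, context=10):
    """Return the raw bytes within `context` bytes of `position` in the file."""
    start = max(position - context, 0)
    with open(path, "rb") as fh:  # binary mode: nothing is decoded here
        fh.seek(start)
        return fh.read(position - start + context + 1)


# Demo on a throwaway file holding one ISO-8859-1 encoded '¿' (byte 0xbf).
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write("name,value\n¿que?,1\n".encode("iso-8859-1"))
    demo_path = tmp.name

# Offset 11 is the first byte after the header line in this demo file.
snippet = bytes_around(demo_path, position=11, context=5)
print(snippet)  # non-ASCII bytes appear as \x.. escapes in the repr
os.remove(demo_path)
```

If the bytes around the offset look like accented Latin characters stored as single bytes, the file is probably in a legacy single-byte encoding rather than UTF-8.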

3 Answers

4

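Change the encoding of your categorical data:

def my_func(df):
    # Only object columns can hold byte strings; .str fails on numeric columns
    for col in df.select_dtypes(include='object').columns:
        df[col] = df[col].str.decode('iso-8859-1').str.encode('utf-8')

This function changes the encoding of your categorical data in place. It assumes those columns hold ISO-8859-1 byte strings.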
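
The same idea can also be applied to the file rather than the DataFrame. The following is a minimal sketch, assuming the CSV really is ISO-8859-1 encoded (which the question does not confirm); `transcode_to_utf8` and the demo file names are hypothetical.

```python
import os
import tempfile


def transcode_to_utf8(src_path, dst_path, src_encoding="iso-8859-1"):
    """Rewrite a text file as UTF-8, decoding it with `src_encoding`."""
    # newline="" leaves line endings untouched on both read and write
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Demo: build a small ISO-8859-1 file, convert it, and read it back as UTF-8.
workdir = tempfile.mkdtemp()
legacy_path = os.path.join(workdir, "legacy.csv")
utf8_path = os.path.join(workdir, "utf8.csv")
with open(legacy_path, "wb") as fh:
    fh.write("name\n¿qué?\n".encode("iso-8859-1"))

transcode_to_utf8(legacy_path, utf8_path)
with open(utf8_path, encoding="utf-8") as fh:
    converted = fh.read()
```

After the conversion, a plain `pd.read_csv(utf8_path)` should decode the file with its default UTF-8 setting.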

3

The byte 0xbf cannot start a UTF-8 sequence, so that character in your file is not encoded in UTF-8.

You can reproduce it with (docs):

b'\xbf'.decode("utf-8", "strict")
Traceback (most recent call last):

  File "<ipython-input-7-4db5a43b4577>", line 1, in <module>
    b'\xbf'.decode("utf-8", "strict")

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 0: invalid start byte

You can try a different encoding, such as ISO-8859-1, which decodes this character without an error:

b'\xbf'.decode("ISO-8859-1", "strict")
Out: '¿'

So your read_csv would change to:

df_temp = pd.read_csv(file_path, index_col=False, encoding="ISO-8859-1") 
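
If you are not sure which encoding the file uses, you can try a few candidates in order. This is a rough sketch, not a real detector: `first_working_encoding` and the candidate list are assumptions made for illustration, and because ISO-8859-1 can decode any byte sequence, a successful decode does not prove the text is interpreted correctly.

```python
import os
import tempfile


def first_working_encoding(path, candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Return the first candidate encoding that decodes the whole file."""
    with open(path, "rb") as fh:
        raw = fh.read()  # reads the whole file; fine for modest CSV sizes
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the bytes, try the next
        return encoding
    raise ValueError("none of the candidate encodings could decode the file")


# Demo: a file containing the byte 0xbf, which is invalid on its own in UTF-8.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"name\n\xbfque?\n")
    sample_path = tmp.name

guess = first_working_encoding(sample_path)
print(guess)
os.remove(sample_path)
```

The result can then be passed on, for example `pd.read_csv(file_path, encoding=first_working_encoding(file_path))`. It is still worth checking a few non-ASCII values by eye afterwards.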

0

OR TO AVOID ENCODING PROBLEMS USE EXCEL (also return DataFrames)

# The context manager saves the workbook when the block exits
with pd.ExcelWriter('train_numeric.xlsx') as writer:
    newTRAIN.to_excel(writer, sheet_name='Sheet1')

THEN

newTEST_excel = pd.read_excel('train_numeric.xlsx')
newTEST_excel.head(2)

1 Comment

Tip... stop the textual "YELLING" ;-)
