4
import pandas as pd
#Define Dataframe

d = {'cola': ['cola1', 'cola2', 'cola3', 'cola4', 'cola4']
    , 'colb': [b'colb1', b'colb2', b'colb3', b'colb4', b'colb4']
    , 'colc': ['colc1', 'colc2', 'colc3', 'colc4', 'colc4']
    , 'cold': [b'cold1', b'cold2', b'cold3', b'cold4', b'cold4']
  }
df = pd.DataFrame(data=d)

#Create flatfile from dataframe
df.to_csv('converted_file.txt', sep='|',index=False)

I would like to convert the bytes to strings i.e. remove 'b' prefix before creating the output file.

I tried the solution mentioned here: How to translate "bytes" objects into literal strings in pandas Dataframe, Python3.x?

str_df = df.select_dtypes([np.object])
str_df = str_df.stack().str.decode('utf-8').unstack()
for col in str_df:
    df[col] = str_df[col]

Although it works for columns [colb] and [cold], columns [cola] and [colc] are blank. This is mainly because all 4 columns are of type object. I am not sure how to select only columns [colb] and [cold] implicitly and then apply the decode function. These two columns need to be selected implicitly for decoding as the dataframe is created from an output of a SQL query.

Has anyone encountered this before and can suggest a solution?

Thanks in advance!

1 Answer 1

11

Demo:

In [12]: df
Out[12]:
    cola      colb   colc      cold
0  cola1  b'colb1'  colc1  b'cold1'
1  cola2  b'colb2'  colc2  b'cold2'
2  cola3  b'colb3'  colc3  b'cold3'
3  cola4  b'colb4'  colc4  b'cold4'
4  cola4  b'colb4'  colc4  b'cold4'

In [13]: df.applymap(lambda x: x.decode() if isinstance(x, bytes) else x)
Out[13]:
    cola   colb   colc   cold
0  cola1  colb1  colc1  cold1
1  cola2  colb2  colc2  cold2
2  cola3  colb3  colc3  cold3
3  cola4  colb4  colc4  cold4
4  cola4  colb4  colc4  cold4
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you @MaxU for your response! But, it doesn't work for me. I understand what the code is trying to do, and the isinstance function works when it is outside applymap function. Do I need to import any library for applymap to work?
Thanks again for your help @MaxU. This worked when I did: df = df.applymap(lambda x: x.decode() if isinstance(x, bytes) else x)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.