Convert bytes to string implicitly in Pandas dataframe (Python 3.6)

Question

import pandas as pd

# Define the DataFrame: 'colb' and 'cold' hold bytes, 'cola' and 'colc' hold str
d = {
    'cola': ['cola1', 'cola2', 'cola3', 'cola4', 'cola4'],
    'colb': [b'colb1', b'colb2', b'colb3', b'colb4', b'colb4'],
    'colc': ['colc1', 'colc2', 'colc3', 'colc4', 'colc4'],
    'cold': [b'cold1', b'cold2', b'cold3', b'cold4', b'cold4'],
}
df = pd.DataFrame(data=d)

# Create a pipe-delimited flat file from the DataFrame
df.to_csv('converted_file.txt', sep='|', index=False)

I would like to convert the bytes values to strings, i.e. remove the b'...' prefix, before creating the output file.

I tried the solution mentioned here: How to translate "bytes" objects into literal strings in pandas Dataframe, Python3.x?

# np.object was removed in NumPy 1.24; the builtin object selects the same columns
str_df = df.select_dtypes([object])
str_df = str_df.stack().str.decode('utf-8').unstack()
for col in str_df:
    df[col] = str_df[col]

Although this works for columns colb and cold, columns cola and colc come out blank. This is mainly because all four columns have dtype object, so the str columns are selected for decoding as well. I am not sure how to select only colb and cold implicitly and then apply the decode function. These two columns need to be selected implicitly because the DataFrame is created from the output of a SQL query.
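A small reproduction of the blank values, as a sketch only: on an object column, `.str.decode` appears to return NaN for any element that is not a bytes object instead of passing it through. The Series name `mixed` is made up for illustration.

```python
import pandas as pd

# Hypothetical mixed column: one str value followed by one bytes value.
mixed = pd.Series(['cola1', b'colb1'], dtype=object)

# str has no .decode() method in Python 3, so the str element cannot be
# decoded; pandas is expected to leave NaN in that position.
decoded = mixed.str.decode('utf-8')

# Show which positions ended up missing after decoding.
print(decoded.isna().tolist())
```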

Has anyone encountered this before and can suggest a solution?

Thanks in advance!

MaxU - stand with Ukraine · Accepted Answer · 2018-02-06 22:36:19Z

11

Demo:

In [12]: df
Out[12]:
    cola      colb   colc      cold
0  cola1  b'colb1'  colc1  b'cold1'
1  cola2  b'colb2'  colc2  b'cold2'
2  cola3  b'colb3'  colc3  b'cold3'
3  cola4  b'colb4'  colc4  b'cold4'
4  cola4  b'colb4'  colc4  b'cold4'

In [13]: df.applymap(lambda x: x.decode() if isinstance(x, bytes) else x)
Out[13]:
    cola   colb   colc   cold
0  cola1  colb1  colc1  cold1
1  cola2  colb2  colc2  cold2
2  cola3  colb3  colc3  cold3
3  cola4  colb4  colc4  cold4
4  cola4  colb4  colc4  cold4
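If you would rather pick out the bytes columns first and leave the others untouched, a column-wise variant could look like the sketch below. This is not part of the demo above; the helper name `is_bytes_column` and the two-row sample frame are assumptions made for illustration, and UTF-8 is assumed as the encoding.

```python
import pandas as pd

# Reduced version of the question's data: one str column, one bytes column.
df = pd.DataFrame({
    'cola': ['cola1', 'cola2'],
    'colb': [b'colb1', b'colb2'],
})

def is_bytes_column(col):
    """Return True when every non-null value in the column is a bytes object."""
    values = col.dropna()
    return len(values) > 0 and all(isinstance(v, bytes) for v in values)

# Detect the bytes columns from the data itself, without naming them.
bytes_cols = [name for name in df.columns if is_bytes_column(df[name])]

# Decode only those columns; the str columns are never passed to .str.decode.
for name in bytes_cols:
    df[name] = df[name].str.decode('utf-8')

print(bytes_cols)
print(df)
```

Columns that mix str and bytes values are skipped by this check, so the element-wise isinstance test from the demo remains the safer choice for data like that.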



2 Comments

KSN_3410 Over a year ago

Thank you @MaxU for your response! But, it doesn't work for me. I understand what the code is trying to do, and the isinstance function works when it is outside applymap function. Do I need to import any library for applymap to work?

KSN_3410 Over a year ago

Thanks again for your help @MaxU. This worked once I assigned the result back: df = df.applymap(lambda x: x.decode() if isinstance(x, bytes) else x)
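The behaviour described in this comment follows from the element-wise call returning a new DataFrame rather than modifying df in place. A minimal sketch of the full round trip is below. It swaps in `df.apply` with `Series.map` per column instead of `applymap`, since `applymap` has been deprecated since pandas 2.1 in favour of `DataFrame.map`; the helper name `to_text`, the two-row frame, and the in-memory buffer (used instead of 'converted_file.txt') are assumptions for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({
    'cola': ['cola1', 'cola2'],
    'colb': [b'colb1', b'colb2'],
})

def to_text(value):
    """Decode bytes to str (UTF-8 assumed); return any other value unchanged."""
    return value.decode('utf-8') if isinstance(value, bytes) else value

# Calling without assignment leaves df untouched: the result is discarded.
df.apply(lambda col: col.map(to_text))
still_bytes = isinstance(df.loc[0, 'colb'], bytes)

# Assigning the result back replaces df with the decoded copy.
df = df.apply(lambda col: col.map(to_text))

# Write the pipe-delimited output to an in-memory buffer for inspection.
buffer = io.StringIO()
df.to_csv(buffer, sep='|', index=False)
csv_text = buffer.getvalue()
print(csv_text)
```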
