Add prefix to specific columns of Dataframe

Question

I have a DataFrame like this:

col1   col2   col3   col4   col5   col6   col7   col8
0      5345   rrf    rrf    rrf    rrf    rrf    rrf
1      2527   erfr   erfr   erfr   erfr   erfr   erfr
2      2727   f      f      f      f      f      f

I would like to rename all columns except col1 and col2.

So I tried a loop:

print(df.columns)
for col in df.columns:
    if col != 'col1' and col != 'col2':
        col.rename = str(col) + '_x'

But it's not very efficient... in fact, it doesn't work!

A.Kot · Accepted Answer · 2016-09-29 15:16:28Z

23

You can use the DataFrame.rename() method:

new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
df.rename(columns = dict(new_names), inplace=True)
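
For reference, here is a self-contained sketch of the same idea. The sample frame is rebuilt from part of the question's data, and the non-in-place call and the variable name `renamed` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question (first four columns only).
df = pd.DataFrame({
    'col1': [0, 1, 2],
    'col2': [5345, 2527, 2727],
    'col3': ['rrf', 'erfr', 'f'],
    'col4': ['rrf', 'erfr', 'f'],
})

# Map every column after the first two to its suffixed name.
new_names = {name: name + '_x' for name in df.columns[2:]}

# Without inplace=True, rename() returns a new DataFrame
# and leaves the original untouched.
renamed = df.rename(columns=new_names)

print(list(renamed.columns))  # ['col1', 'col2', 'col3_x', 'col4_x']
```

Returning a new frame makes it easy to compare the result with the original while experimenting.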

edited Sep 29, 2016 at 15:16

answered Sep 29, 2016 at 14:25

A.Kot


1 Comment

Peter Chen Over a year ago

Great solution! Additional comment: new_names = [(i,i+'_x') for i in df.columns if i not in ['col1','col2']]

jezrael · Accepted Answer · 2016-09-29 14:50:52Z

Simplest solution if col1 and col2 are the first two columns:

df.columns = df.columns[:2].union(df.columns[2:]  + '_x')
print (df)
   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f
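
One caveat worth checking before relying on union: it returns a sorted set union, so it preserves the column order here only because the names already sort that way. The sketch below uses a hypothetical frame with unsorted names to show the difference, with Index.append as an order-preserving alternative.

```python
import pandas as pd

# Hypothetical frame whose column names are NOT in sorted order.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['id', 'key', 'b', 'a'])

suffixed = df.columns[2:] + '_x'

# union() sorts its result, so the original column order is lost.
via_union = df.columns[:2].union(suffixed)

# append() concatenates the two Index objects in their existing order.
via_append = df.columns[:2].append(suffixed)

print(list(via_union))   # ['a_x', 'b_x', 'id', 'key']
print(list(via_append))  # ['id', 'key', 'b_x', 'a_x']
```

Assigning `via_union` to `df.columns` in this case would silently attach the wrong labels to the data.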

Another solution with isin or list comprehension:

cols = df.columns[~df.columns.isin(['col1','col2'])]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']

df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)

print (df)

   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f

cols = [col for col in df.columns if col not in ['col1', 'col2']]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']

# cols is a plain list here, so build the mapping with a dict comprehension
df.rename(columns = {col: col + '_x' for col in cols}, inplace=True)

print (df)

   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f

The fastest is list comprehension:

df.columns = [col+'_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
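
If the set of columns to leave alone varies, the same list comprehension can be wrapped in a small helper. This is only a sketch: the function name `add_suffix_except` and its parameters are made up for illustration.

```python
import pandas as pd


def add_suffix_except(df, keep, suffix='_x'):
    """Return a copy of df with suffix appended to every column not in keep."""
    keep = set(keep)  # set lookup keeps the membership test cheap
    out = df.copy()
    out.columns = [c if c in keep else f'{c}{suffix}' for c in df.columns]
    return out


df = pd.DataFrame([[0, 5345, 'rrf', 'rrf']],
                  columns=['col1', 'col2', 'col3', 'col4'])
result = add_suffix_except(df, keep=['col1', 'col2'])

print(list(result.columns))  # ['col1', 'col2', 'col3_x', 'col4_x']
```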

Timings:

In [350]: %timeit (akot(df))
1000 loops, best of 3: 387 µs per loop

In [351]: %timeit (jez(df1))
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 207 µs per loop

In [363]: %timeit (jez3(df2))
The slowest run took 6.41 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 75.7 µs per loop

df1 = df.copy()
df2 = df.copy()

def jez(df):
    df.columns = df.columns[:2].union(df.columns[2:]  + '_x')
    return df

def akot(df):
    new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
    df.rename(columns = dict(new_names), inplace=True)
    return df


def jez3(df):
   df.columns = [col + '_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
   return df


print (akot(df))
print (jez(df1))
print (jez3(df2))
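
Note that the benchmark functions above modify the frame they receive, so under %timeit every repetition after the first works on already-suffixed names. A rough way to avoid that, sketched here with the standard timeit module, is to copy the frame inside each function. The copy cost is then included in the measurement, and the function names `via_rename` and `via_listcomp` are illustrative.

```python
import timeit

import pandas as pd

df = pd.DataFrame([[0, 5345, 'rrf', 'rrf']],
                  columns=['col1', 'col2', 'col3', 'col4'])


def via_rename(frame):
    frame = frame.copy()  # fresh copy so every run starts from the same names
    frame.rename(columns={c: c + '_x' for c in frame.columns[2:]}, inplace=True)
    return frame


def via_listcomp(frame):
    frame = frame.copy()
    frame.columns = [c if c in ('col1', 'col2') else c + '_x'
                     for c in frame.columns]
    return frame


for func in (via_rename, via_listcomp):
    seconds = timeit.timeit(lambda: func(df), number=1000)
    print(f'{func.__name__}: {seconds / 1000 * 1e6:.1f} µs per call')
```

Absolute numbers will differ by machine and pandas version, so only the relative ordering is worth reading from a run like this.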

EdChum · Accepted Answer · 2016-09-29 14:34:44Z

5

You can use str.contains with a regex pattern to select the columns of interest, then build a dict with zip and pass it as the columns argument to rename:

In [94]:
cols = df.columns[~df.columns.str.contains('col1|col2')]
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
df

Out[94]:
   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f

Because str.contains, negated with ~, selects the columns that don't match the pattern, the column order is irrelevant.
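
One thing to watch with this approach: str.contains matches substrings, so the pattern 'col1|col2' also matches a name such as col10. The sketch below uses a hypothetical frame that includes col10 to show the effect, with isin as an exact-match alternative.

```python
import pandas as pd

# Hypothetical frame that includes a name containing 'col1' as a substring.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['col1', 'col2', 'col3', 'col10'])

# Unanchored pattern: 'col10' contains 'col1', so it is excluded as well.
loose = df.columns[~df.columns.str.contains('col1|col2')]

# Exact-name matching with isin() avoids the substring problem.
exact = df.columns[~df.columns.isin(['col1', 'col2'])]

print(list(loose))  # ['col3']
print(list(exact))  # ['col3', 'col10']
```

Anchoring the regex, for example '^(?:col1|col2)$', would be another way to get exact matches.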

edited Sep 29, 2016 at 14:34

answered Sep 29, 2016 at 14:23

EdChum


2 Comments

Geo-x Over a year ago

Wow! It's perfect! Is it possible to use str.value = or something like that?

EdChum Over a year ago

Not sure what you're attempting with that code snippet, but generally you need to use rename or overwrite the columns attribute directly

Suhas_Pote · Accepted Answer · 2021-11-23 15:12:22Z

1

Quick solution that works for me

As EdChum suggested, I used str.contains and ~ to filter out the columns

cols = df.columns[~df.columns.str.contains('col1|col2')]

Then I used the pandas rename function:

df.rename(columns={col: col + '_x' for col in df.columns if col in cols}, inplace=True)

P.S. df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True) did not work in my case.
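
The answer does not say why the zip form failed. One possible cause, offered only as a guess, is that cols was a plain Python list rather than a pandas Index: adding a string to an Index of strings works label by label, while adding a string to a list raises TypeError. A minimal sketch of the difference:

```python
import pandas as pd

index_cols = pd.Index(['col3', 'col4'])
list_cols = list(index_cols)

# Index + str appends the suffix to each label.
print(list(index_cols + '_x'))  # ['col3_x', 'col4_x']

# list + str is not defined and raises TypeError.
try:
    list_cols + '_x'
    failed = False
except TypeError:
    failed = True
print(failed)  # True

# A dict comprehension works for either container.
mapping = {c: c + '_x' for c in list_cols}
print(mapping)  # {'col3': 'col3_x', 'col4': 'col4_x'}
```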

answered Nov 23, 2021 at 15:12

Suhas_Pote
