I am trying to do 2 things in Python:
- Select the names of specific columns using a regex
- Rename these selected columns using a list of names (the names are unfortunately stored in their own weird dataframe)
I am new to Python and pandas, and after a bunch of googling I am still getting this error: TypeError: Index does not support mutable operations. Here's what I am doing:
import pandas as pd
import numpy as np
df = pd.DataFrame(
    data=np.array([
        [1, 3, 3, 4, 5, 9, 5],
        [1, 2, 4, 4, 5, 8, 4],
        [1, 2, 3, 'a', 5, 7, 3],
        [1, 2, 3, 4, 'e', 6, 2],
        ['f', 2, 3, 4, 5, 6, 1]
    ]),
    columns=['a', 'car-b', 'car-c', 'car-d', 'car-e', 'car-f', 'car-g']
)
#Select the NAMES of the columns that contain 'car' in them as I want to change these column names
names_to_change = df.columns[df.columns.str.contains("car")]
names_to_change
#Here is the dataset that has the names that I want to use to replace these
#This is just how the names are stored in the workflow
new_names=pd.DataFrame(data=np.array([
['new_1','new_3','new_5'],
['new_2','new_4','new_6']
]))
new_names
#My approach is to transform the new names into a list
new_names_list=pd.melt(new_names).iloc[:,1].tolist()
new_names_list
#Now I figure I would use .columns to do the replacement
#But this returns the mutability error
df.columns[df.columns.str.contains("car")]=new_names_list
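#This also fails (the combined list has 12 names for 7 columns, so pandas raises a length-mismatch ValueError here rather than the same error)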
df.columns = df.columns[df.columns.str.contains("car")].tolist()+new_names_list
Traceback (most recent call last):
  File "C:\Users\zsg876\AppData\Local\Temp/ipykernel_1340/261138782.py", line 44, in <module>
    df.columns[df.columns.str.contains("car")]=new_names_list
  File "C:\Users\zsg876\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 4585, in __setitem__
    raise TypeError("Index does not support mutable operations")
TypeError: Index does not support mutable operations
I tried a bunch of different methods (this was no help: how to rename columns in pandas using a list) and haven't had much luck. I am coming over from R where renaming columns was a lot simpler -- you'd just pass a vector using names().
I take it the workflow is different here? Appreciate any suggestions!
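For comparison with R's names(), here is a minimal sketch of what seems to be the closest pandas equivalent: edit a plain NumPy copy of the column labels, then assign the whole thing back to df.columns in one go. The small stand-in frame and the variable names mask and cols are assumptions made for illustration; only the column names and new_names_list mirror the example above.

```python
import pandas as pd

# Stand-in frame with the same column names as the example (values are placeholders).
df = pd.DataFrame(
    [[0] * 7],
    columns=["a", "car-b", "car-c", "car-d", "car-e", "car-f", "car-g"],
)
new_names_list = ["new_1", "new_2", "new_3", "new_4", "new_5", "new_6"]

# Boolean mask marking the columns whose name contains "car".
mask = df.columns.str.contains("car")

# An Index cannot be modified in place, so work on a writable NumPy copy of the labels.
cols = df.columns.to_numpy(copy=True)
cols[mask] = new_names_list

# Replacing the entire Index in one assignment is allowed.
df.columns = cols
```

This only works when the number of selected columns equals the number of new names; the replacement is positional, so the order of new_names_list matters.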
UPDATE:
This seems to do the trick, but I am not sure why exactly. I figured replacing one list with another of equal length would work, but that does not seem to be the case. Can anyone educate me here?
col_rename_dict=dict(zip(names_to_change,new_names_list))
df.rename(columns=col_rename_dict, inplace=True)
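As far as I can tell, the reason is that df.columns is a pandas Index rather than a plain list, and an Index does not allow item assignment, whereas rename() never touches the existing Index: it maps old labels to new ones and builds a fresh Index. The sketch below illustrates that difference on a small stand-in frame; the frame and the names raised and renamed are assumptions for illustration only.

```python
import pandas as pd

# Small stand-in frame (values are placeholders).
df = pd.DataFrame([[0, 1, 2]], columns=["a", "car-b", "car-c"])

# Item assignment on an Index raises TypeError, because Index objects are immutable.
try:
    df.columns[1] = "new_1"
    raised = False
except TypeError:
    raised = True

# rename() looks up each old label in the mapping and returns a frame with a new Index.
# Labels that are missing from the mapping ("a" here) are left unchanged.
renamed = df.rename(columns={"car-b": "new_1", "car-c": "new_2"})
```

Because the mapping is keyed by label rather than position, the dict approach keeps working even if the matched columns are not contiguous or the frame's column order changes.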