I have a dictionary of polars.DataFrames called data_dict.
Every DataFrame in the dict values has an extra index column named '' (an empty string).
I want to drop that column and set a new column named 'name_ID'.
Code:
data_pl = pl.concat(data_dict.values()).with_row_index('name_ID')
Error:
polars.exceptions.DuplicateError: column with name 'name_ID' has more than one occurrence
- My columns:
['','name_ID','col1',....,'colN']
Tried methods:
data_pl.to_pandas().set_index('name_ID')
This fails because of memory limits: pandas.set_index() needs more GiB than I have available to allocate.
Please suggest alternatives for setting an index column on a polars.DataFrame.
.drop('') will still leave you with a DuplicateError because you have an existing name_ID column and are trying to add another one using with_row_index('name_ID'). This is why a runnable example is required. The User Guide explains "no index": docs.pola.rs/user-guide/migration/pandas/…

to_pandas() was performed taking the first column which happened to be name_ID, a bit of a workaround but there was no need to use with_row_index.