Pandas - Set selected rows' columns to other DataFrame's rows' column values

Question

In the code below, unique_allocations is a DataFrame of 4 rows holding different numbers, say [1,2,3,4]. For every row of filtered_processed_full, I duplicate the row 4 times and then try to set allocation_columns to unique_allocations. For example, if curr_df = [[taco, allocation], [0 , 1], [0 , 1], [0 , 1], [0 , 1]] (pretend the first row holds the column names and the subsequent rows hold the values), I'd like to transform it into curr_df = [[taco, allocation], [0 , 1], [0 , 2], [0 , 3], [0 , 4]], where in this example allocation_columns is allocation. How do I set it this way? Currently the print() just prints curr_df with the columns I wanted to change left unchanged.

import pandas as pd

for idx, row in filtered_processed_full.iterrows():
    curr_df = pd.DataFrame()
    for i in range (4):
        curr_df = curr_df.append(row)
    curr_df[allocation_columns] = unique_allocations
    with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 3,
                       ):             
        print(curr_df[allocation_columns])

We may need a minimal reproducible example with data for more than one row of filtered_processed_full. In your current code, curr_df is defined inside the outer for loop, so each iteration overwrites the result of the previous one. Is that intended or a mistake? A more complete example of your output would help.
– Ignatius Reilly, Commented Aug 15, 2023 at 3:04

taller · Accepted Answer · 2023-08-15 01:28:59Z

Replace DataFrame column with new values from unique_allocations.

import pandas as pd

data = [1,2,3,4]
unique_allocations = pd.DataFrame(data=data, columns=['allo_new'])

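for idx, row in filtered_processed_full.iterrows():
    # DataFrame.append was removed in pandas 2.0; build the 4 copies directly
    curr_df = pd.DataFrame([row] * 4)
    # replace column with new values (assumes allocation_columns is a single
    # column name); .to_numpy() assigns by position, because curr_df's index
    # (idx repeated 4 times) does not match the 0..3 index of unique_allocations
    curr_df[allocation_columns] = unique_allocations['allo_new'].to_numpy()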
    with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 3,
                       ):             
        print(curr_df[allocation_columns])

Testing code and output:

import pandas as pd
data = [['taco', 'allocation'], [0 , 1], [0 , 1], [0 , 1], [0 , 1]]
df = pd.DataFrame(data=data[1:], columns=data[0])
print(df)
data2 = [1,2,3,4]
df2 = pd.DataFrame(data=data2, columns=['allo_new'])
df['allocation'] = df2['allo_new']
print(df)

Output

   taco  allocation
0     0           1
1     0           1
2     0           1
3     0           1
   taco  allocation
0     0           1
1     0           2
2     0           3
3     0           4
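
One caveat when moving from this test to the loop in the question: the test works because df and df2 share the same 0..3 index. Assigning a Series to a column aligns on index labels, so rows duplicated from a single source row (which all carry that row's label) will not pick up the values by position. Below is a minimal sketch of the difference, assuming a hypothetical source row labelled 7; the names row and new_values are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical data: one source row labelled 7, duplicated four times.
row = pd.Series({"taco": 0, "allocation": 1}, name=7)
curr_df = pd.DataFrame([row] * 4)       # index is [7, 7, 7, 7]
new_values = pd.Series([1, 2, 3, 4])    # index is [0, 1, 2, 3]

# Series assignment aligns on index labels; label 7 is absent, so all NaN.
aligned = curr_df.copy()
aligned["allocation"] = new_values

# A plain NumPy array has no index, so values are assigned by position.
positional = curr_df.copy()
positional["allocation"] = new_values.to_numpy()

print(aligned["allocation"].isna().all())
print(positional["allocation"].tolist())
```

If the positional behaviour is what you want, passing the values as an array (or resetting the index of curr_df first) avoids the alignment step.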

Ignatius Reilly · Accepted Answer · 2023-08-15 03:34:50Z

If I understand correctly, you are trying to add tiles of unique values to each value of the original DataFrame.

If you assign curr_df = pd.DataFrame() inside the outer for loop, the values of curr_df are going to be replaced after each iteration. You could store them somewhere else, but it's easier and probably faster to use either numpy or pandas built-in tools.

import numpy as np
import pandas as pd

unique_allocations = pd.DataFrame([1, 2, 3, 4], columns=['unique_allocations'])
filered_processed_full = pd.DataFrame([0, 1, 2], columns=['taco'])

# repeat values in filered_processed_full as many times as unique allocations
reps = len(unique_allocations)
curr_df = filered_processed_full.loc[filered_processed_full.index.repeat(reps)]

# add tiles of unique allocations, one per each original value
number_of_tiles = len(filered_processed_full)
curr_df['allocation_columns'] = np.tile(unique_allocations.values.squeeze(), number_of_tiles)

print(curr_df)

# Outputs
   taco  allocation_columns
0     0                   1
0     0                   2
0     0                   3
0     0                   4
1     1                   1
1     1                   2
1     1                   3
1     1                   4
2     2                   1
2     2                   2
2     2                   3
2     2                   4
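
As a pandas-only alternative, the same pairing of every original row with every unique allocation can probably be expressed as a cross join. This is a sketch, not part of the original answer; it assumes pandas 1.2 or later (where merge accepts how='cross') and reuses the example frames above. Note that the result gets a fresh 0..11 index rather than the repeated original index.

```python
import pandas as pd

unique_allocations = pd.DataFrame([1, 2, 3, 4], columns=['unique_allocations'])
filered_processed_full = pd.DataFrame([0, 1, 2], columns=['taco'])

# Cross join: every row on the left is paired with every row on the right.
crossed = (
    filered_processed_full
    .merge(unique_allocations, how='cross')
    .rename(columns={'unique_allocations': 'allocation_columns'})
)

print(crossed)
```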

Here I repeat the values with an adaptation of this answer, but you could use numpy.repeat.
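
For reference, a minimal sketch of the numpy.repeat variant, assuming the same example frames as above and a single 'taco' column. Unlike the index.repeat approach, building the frame from arrays produces a fresh 0..11 index.

```python
import numpy as np
import pandas as pd

unique_allocations = pd.DataFrame([1, 2, 3, 4], columns=['unique_allocations'])
filered_processed_full = pd.DataFrame([0, 1, 2], columns=['taco'])

reps = len(unique_allocations)
number_of_tiles = len(filered_processed_full)

repeated_df = pd.DataFrame({
    # each original value repeated `reps` times in a row: 0,0,0,0,1,1,1,1,...
    'taco': np.repeat(filered_processed_full['taco'].to_numpy(), reps),
    # the whole block of allocations tiled once per original value: 1,2,3,4,1,2,...
    'allocation_columns': np.tile(
        unique_allocations['unique_allocations'].to_numpy(), number_of_tiles
    ),
})

print(repeated_df)
```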
