I have the code below. unique_allocations is a DataFrame of 4 rows holding different numbers, say [1,2,3,4]. For every row of filtered_processed_full, I duplicate the row 4 times and then try to set allocation_columns to unique_allocations. For example, if curr_df = [[taco, allocation], [0 , 1], [0 , 1], [0 , 1], [0 , 1]] (pretend the first row holds the column names and the subsequent rows hold the values), I'd like to transform it into curr_df = [[taco, allocation], [0 , 1], [0 , 2], [0 , 3], [0 , 4]], where allocation_columns is allocation in this example. How do I set it this way? Currently the print() just prints curr_df with the columns I wanted to change left unchanged.

import pandas as pd

for idx, row in filtered_processed_full.iterrows():
    curr_df = pd.DataFrame()
    for i in range (4):
        curr_df = curr_df.append(row)
    curr_df[allocation_columns] = unique_allocations
    with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 3,
                       ):             
        print(curr_df[allocation_columns])
  • We may need a minimal reproducible example with data for more than a single row of filtered_processed_full. In the current version of your code, you define curr_df inside the outer for loop, which means that each iteration overwrites the result of the previous one. Is this your intention or a mistake? A more complete example of your output would help. Commented Aug 15, 2023 at 3:04

2 Answers

Replace DataFrame column with new values from unique_allocations.

import pandas as pd

data = [1, 2, 3, 4]
unique_allocations = pd.DataFrame(data=data, columns=['allo_new'])

for idx, row in filtered_processed_full.iterrows():
    # DataFrame.append was removed in pandas 2.0; build the 4 copies directly
    curr_df = pd.DataFrame([row] * 4)
    # replace column with new values (assumes allocation_columns is a single column name);
    # .to_numpy() assigns by position, since every row of curr_df carries the same label idx
    curr_df[allocation_columns] = unique_allocations['allo_new'].to_numpy()
    with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 3,
                       ):
        print(curr_df[allocation_columns])

Testing code and output:

import pandas as pd
data = [['taco', 'allocation'], [0 , 1], [0 , 1], [0 , 1], [0 , 1]]
df = pd.DataFrame(data=data[1:], columns=data[0])
print(df)
data2 = [1,2,3,4]
df2 = pd.DataFrame(data=data2, columns=['allo_new'])
df['allocation'] = df2['allo_new']
print(df)

Output

   taco  allocation
0     0           1
1     0           1
2     0           1
3     0           1
   taco  allocation
0     0           1
1     0           2
2     0           3
3     0           4
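
One thing worth checking, offered as a possible explanation rather than a confirmed diagnosis: the test above works because both frames share the default index 0..3. pandas aligns a Series on index labels when it is assigned to a column, and a frame built by repeating one row carries the same label on every row. A minimal sketch with made-up data (the names row, aligned and positional are illustrative and not from the question):

```python
import pandas as pd

# Hypothetical stand-in for one row yielded by iterrows(); name=0 plays the role of idx
row = pd.Series({"taco": 0, "allocation": 1}, name=0)
curr_df = pd.DataFrame([row] * 4)  # index labels are [0, 0, 0, 0]
unique_allocations = pd.DataFrame({"allo_new": [1, 2, 3, 4]})  # index labels are [0, 1, 2, 3]

# Assigning a Series aligns on labels: every row has label 0,
# so every row receives the value stored at label 0
aligned = curr_df.copy()
aligned["allocation"] = unique_allocations["allo_new"]

# Assigning a NumPy array ignores labels and fills the rows in order
positional = curr_df.copy()
positional["allocation"] = unique_allocations["allo_new"].to_numpy()

print(aligned["allocation"].tolist())
print(positional["allocation"].tolist())
```

With label 0 on every row, the aligned assignment yields 1, 1, 1, 1, which looks as if nothing changed. With a label that does not exist in unique_allocations it would yield NaN instead. The positional assignment yields 1, 2, 3, 4 in both cases.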

If I understand correctly, you are trying to pair every row of the original DataFrame with the full set of unique values, i.e. to add one tile of unique values per original row.

If you assign curr_df = pd.DataFrame() inside the outer for loop, the values of curr_df are going to be replaced after each iteration. You could store them somewhere else, but it's easier and probably faster to use either numpy or pandas built-in tools.
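
For comparison, here is a rough sketch of the "store them somewhere else" option, assuming the loop is kept. The sample data, the pieces list and the result name are made up for illustration, and a single column called allocation is assumed:

```python
import pandas as pd

# Hypothetical data standing in for the question's frames
filtered_processed_full = pd.DataFrame({"taco": [0, 1, 2], "allocation": [1, 1, 1]})
unique_allocations = pd.DataFrame({"allo_new": [1, 2, 3, 4]})

pieces = []  # one small frame per original row
for idx, row in filtered_processed_full.iterrows():
    # repeat the current row once per unique allocation
    curr_df = pd.DataFrame([row] * len(unique_allocations))
    # to_numpy() assigns by position rather than by index label
    curr_df["allocation"] = unique_allocations["allo_new"].to_numpy()
    pieces.append(curr_df)

# stitch the per-row frames together with a fresh 0..n-1 index
result = pd.concat(pieces, ignore_index=True)
print(result)
```

The vectorized version that follows avoids the loop altogether.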

import numpy as np
import pandas as pd

unique_allocations = pd.DataFrame([1, 2, 3, 4], columns=['unique_allocations'])
filtered_processed_full = pd.DataFrame([0, 1, 2], columns=['taco'])

# repeat each row of filtered_processed_full once per unique allocation
reps = len(unique_allocations)
curr_df = filtered_processed_full.loc[filtered_processed_full.index.repeat(reps)]

# add tiles of unique allocations, one tile per original row
number_of_tiles = len(filtered_processed_full)
curr_df['allocation_columns'] = np.tile(unique_allocations.values.squeeze(), number_of_tiles)

print(curr_df)

# Outputs
   taco  allocation_columns
0     0                   1
0     0                   2
0     0                   3
0     0                   4
1     1                   1
1     1                   2
1     1                   3
1     1                   4
2     2                   1
2     2                   2
2     2                   3
2     2                   4

Here I repeat the values with an adaptation of this answer, but you could use numpy.repeat.
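
A sketch of the numpy.repeat variant, using the same toy data as above. It builds a new frame from plain arrays, so it gets a fresh 0..11 index instead of the repeated labels shown in the output above:

```python
import numpy as np
import pandas as pd

unique_allocations = pd.DataFrame([1, 2, 3, 4], columns=["unique_allocations"])
filtered_processed_full = pd.DataFrame([0, 1, 2], columns=["taco"])

reps = len(unique_allocations)          # copies of each original row
tiles = len(filtered_processed_full)    # copies of the allocation sequence

curr_df = pd.DataFrame({
    # each taco value repeated reps times in a row: 0,0,0,0,1,1,1,1,...
    "taco": np.repeat(filtered_processed_full["taco"].to_numpy(), reps),
    # the whole allocation sequence repeated once per original row: 1,2,3,4,1,2,3,4,...
    "allocation_columns": np.tile(unique_allocations["unique_allocations"].to_numpy(), tiles),
})
print(curr_df)
```

np.repeat repeats element by element while np.tile repeats the whole sequence, which is what lines up each original row with every allocation.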
