1

I am trying to write a loop that splits one large dataframe into multiple dataframes, one per name.

import pandas as pd

raw_df = pd.DataFrame()
raw_df['names'] = ['joe', 'joe', 'bob', 'john', 'john']
raw_df['order_id'] = [10, 12, 5, 20, 25]
raw_df['amount'] = [100, 1000, 200, 20, 25]

for name in raw_df['names'].unique():
    names = pd.DataFrame(raw_df.loc[raw_df['names'] == name])
    names['cumulative_sum'] = names['amount'].cumsum()

Expected outcome for each name, e.g. joe.head():

name   id   sum
joe    10   100
joe    12   1100

2 Answers

2

Instead of filtering on each unique value, you can iterate over .groupby on the column of interest:

for group_name, group_df in raw_df.groupby("names"):
    print("Processing name:", group_name)
    # Same rows as "names" in your snippet; copy so the new column does not touch raw_df
    names = group_df.copy()
    names["cum_sum"] = names["amount"].cumsum()

The group_df holds the same rows one would get with raw_df.loc[raw_df['names'] == name].
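
If the per-name running total is all that is needed, the loop can probably be skipped altogether. The following is a minimal sketch, assuming the column is called "names" as in the question's setup code; the "cum_sum" column name is carried over from the snippet above.

```python
import pandas as pd

# Sample data modelled on the question; the column name "names" is an assumption
# taken from the question's setup code.
raw_df = pd.DataFrame({
    "names": ["joe", "joe", "bob", "john", "john"],
    "order_id": [10, 12, 5, 20, 25],
    "amount": [100, 1000, 200, 20, 25],
})

# Running total of "amount" within each name, aligned back to the original rows.
raw_df["cum_sum"] = raw_df.groupby("names")["amount"].cumsum()

# Rows for a single name can then be selected with an ordinary boolean mask.
joe = raw_df[raw_df["names"] == "joe"]
print(joe)
```

This keeps everything in one dataframe, so a per-name view is just a filter rather than a separate object to keep track of.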

0

You could store each group under its own variable name:

variables = locals()

for name, data in raw_df.groupby('names'):
    variables[name] = data
 
joe
Out[607]: 
  names  order_id  amount
0   joe        10     100
1   joe        12    1000
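
Note that writing into locals() only behaves like this at module or interactive level; inside a function, changes to that dictionary are not guaranteed to create local variables. A plain dictionary keyed by name avoids the issue. The sketch below is one way to do that, with the hypothetical name frames standing in for whatever container suits your code.

```python
import pandas as pd

# Sample data modelled on the question.
raw_df = pd.DataFrame({
    "names": ["joe", "joe", "bob", "john", "john"],
    "order_id": [10, 12, 5, 20, 25],
    "amount": [100, 1000, 200, 20, 25],
})

# Map each name to its own sub-DataFrame instead of creating variables dynamically.
# "frames" is an illustrative name, not something from the original answer.
frames = {name: data.copy() for name, data in raw_df.groupby("names")}

# Look a group up by key rather than by variable name.
joe = frames["joe"]
print(joe)
```

The lookup frames["joe"] also works for names that are not valid Python identifiers, such as values containing spaces.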
