1

My input looks like this:

import datetime as dt
import pandas as pd

some_money = [34,42,300,450,550]
df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09'],
                   'MONEY': some_money})
df

Producing the following:

[image: the resulting DataFrame with the columns TIME and MONEY]

I want to add 3 more columns, each holding the MONEY value of an earlier month (one, two, and three months back), like this (color coding for illustrative purposes):

[image: the same DataFrame with three additional columns, color-coded]

This is what I have tried:

prev_period_money = ["m-1", "m-2", "m-3"]
for m in prev_period_money:
    df[m] = df["MONEY"] - 10  # well, it "works", but it only gives df["MONEY"] - 10...

The TIME column is already sorted, so it can be ignored. (But it would be great if someone could show the "magic" of getting the data based on it.)

4 Answers

2

For pandas 0.24+, pass fill_value=0 to Series.shift; the new columns then also stay integer columns:

for x in range(1,4):
    df[f"m-{x}"] = df["MONEY"].shift(periods=-x, fill_value=0)

print (df)
      TIME  MONEY  m-1  m-2  m-3
0  2020-01     34   42  300  450
1  2019-12     42  300  450  550
2  2019-11    300  450  550    0
3  2019-10    450  550    0    0
4  2019-09    550    0    0    0

For pandas below 0.24, it is necessary to replace the missing values and convert back to integers:

for x in range(1,4):
    df[f"m-{x}"] = df["MONEY"].shift(periods=-x).fillna(0).astype(int)
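
To see why the two variants differ, it can help to compare the dtypes directly. This is only a minimal sketch that reuses the question's data; the variable names with_fill, plain and casted are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "TIME": ["2020-01", "2019-12", "2019-11", "2019-10", "2019-09"],
    "MONEY": [34, 42, 300, 450, 550],
})

# pandas 0.24+: the gap is filled during the shift, so the dtype stays integer
with_fill = df["MONEY"].shift(periods=-1, fill_value=0)

# a plain shift introduces NaN, which forces the column to float
plain = df["MONEY"].shift(periods=-1)

# older route: fill the NaN afterwards and cast back to an integer dtype
casted = plain.fillna(0).astype(int)

print(with_fill.dtype, plain.dtype, casted.dtype)
```

The values of with_fill and casted are the same; only the intermediate float step differs.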

2 Comments

much shorter syntax.
The per-column fillna(0) is actually quite important; otherwise df = df.fillna(0) may "pollute" the whole table.
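
A small sketch of that "pollution", assuming a hypothetical extra column BONUS (not part of the question) that contains a genuine missing value which should stay missing:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: BONUS has a real gap that is unrelated to the shift
df = pd.DataFrame({
    "MONEY": [34, 42, 300],
    "BONUS": [1.5, np.nan, 2.0],
})

df["m-1"] = df["MONEY"].shift(-1)

# Whole-frame fill: also replaces the unrelated NaN in BONUS with 0
polluted = df.fillna(0)

# Per-column fill: only the shifted column is touched
clean = df.copy()
clean["m-1"] = clean["m-1"].fillna(0)

print(polluted)
print(clean)
```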
1

It is quite easy if you use shift.

This gives you the desired output:

df["m-1"] = df["MONEY"].shift(periods=-1)
df["m-2"] = df["MONEY"].shift(periods=-2)
df["m-3"] = df["MONEY"].shift(periods=-3)
df = df.fillna(0)

This works only if the data is already ordered by TIME. Otherwise you have to sort it first.
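
If the rows are not ordered yet, sorting could look roughly like this. It is a sketch that assumes TIME always has the "YYYY-MM" format from the question (such strings sort chronologically as plain text); the shuffled input is made up for illustration.

```python
import pandas as pd

# Same data as in the question, but deliberately shuffled
df = pd.DataFrame({
    "TIME": ["2019-11", "2020-01", "2019-09", "2019-12", "2019-10"],
    "MONEY": [300, 34, 550, 42, 450],
})

# Newest month first, as in the question; reset the index to get clean row labels
df = df.sort_values("TIME", ascending=False).reset_index(drop=True)

for x in range(1, 4):
    df[f"m-{x}"] = df["MONEY"].shift(periods=-x)
df = df.fillna(0)

print(df)
```

Note that this still assumes there are no missing months; shift only looks at row positions, not at the dates themselves.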

2 Comments

If you knew that "shift" exists... I was expecting "Offset" as a term. :) Thanks :)
@Vityata that's true. Knowledge is power. Btw instead of three lines you can also integrate that in your for loop.
1

My suggestion: use a list comprehension with the shift function to get your three columns, concatenate them along the columns, and then concatenate the result with the original dataframe:

(pd.concat([df,
            pd.concat([df.MONEY.shift(-i) for i in range(1, 4)], axis=1)],
           axis=1)
   .fillna(0)
)


      TIME  MONEY  MONEY  MONEY  MONEY
0  2020-01     34   42.0  300.0  450.0
1  2019-12     42  300.0  450.0  550.0
2  2019-11    300  450.0  550.0    0.0
3  2019-10    450  550.0    0.0    0.0
4  2019-09    550    0.0    0.0    0.0
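
All new columns in this output are labeled MONEY. One possible way to get distinct labels is to rename each shifted Series before concatenating. This is only a sketch; the names m-1, m-2, m-3 are taken from the question, and the variables shifted and result are made up here.

```python
import pandas as pd

df = pd.DataFrame({
    "TIME": ["2020-01", "2019-12", "2019-11", "2019-10", "2019-09"],
    "MONEY": [34, 42, 300, 450, 550],
})

# Rename each shifted Series so the concatenated columns get distinct labels
shifted = [df["MONEY"].shift(-i).rename(f"m-{i}") for i in range(1, 4)]

result = pd.concat([df, *shifted], axis=1).fillna(0)

print(result)
```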

1
import pandas as pd

some_money = [34,42,300,450,550]

df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09'], 'MONEY':some_money})

prev_period_money = ["m-1", "m-2", "m-3"]
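# for each offset, take MONEY from that row onwards and renumber the index from 0,
# so that the values line up with the earlier rows on assignment
for count, m in enumerate(prev_period_money, start=1):
    df[m] = df['MONEY'].iloc[count:].reset_index(drop=True)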

df = df.fillna(0)

Output:

      TIME  MONEY    m-1    m-2    m-3
0  2020-01     34   42.0  300.0  450.0
1  2019-12     42  300.0  450.0  550.0
2  2019-11    300  450.0  550.0    0.0
3  2019-10    450  550.0    0.0    0.0
4  2019-09    550    0.0    0.0    0.0

2 Comments

What is the idea of the dropping and the reseting of the index?
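@Vityata iloc selects the rows by position, but they keep their original index labels. Resetting the index (and dropping the old one) renumbers them from 0, so the values line up with the earlier rows when assigned.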
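
A small sketch of what happens with and without the reset; the column name aligned and the variable tail are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"MONEY": [34, 42, 300, 450, 550]})

# Rows from position 1 onwards; their index labels are still 1..4
tail = df["MONEY"].iloc[1:]

# Without a reset, assignment aligns on labels: every value lands back
# in its own row, and row 0 gets NaN
df["aligned"] = tail

# With the reset, the labels become 0..3, so each value moves up one row
df["m-1"] = tail.reset_index(drop=True)

print(df)
```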
