Make a shift by index with a pandas DataFrame

Question

Is there a pandas way to do this?

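from datetime import timedelta

predicted_sells = []
for index_tms in df.index:  # the timestamp is the index, not a column
    delta = index_tms + timedelta(hours=1)
    try:
        # number of cars sold exactly one hour later
        sells_to_predict = df.loc[delta, 'sell']
    except KeyError:
        # no row exists one hour later
        sells_to_predict = None
    predicted_sells.append(sells_to_predict)

df['sell_to_predict'] = predicted_sells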

Explanation of the example:

sell is the number of cars I sold at time tms. sell_to_predict is the number of cars I sold the hour after, which is what I want to predict. So I want to build a new column that holds, at time tms, the number of cars I will sell at time tms + 1h.

Before my code runs, the data looks like this:

                tms  sell 
2015-11-23 15:00:00     6               
2015-11-23 16:00:00     2               
2015-11-23 17:00:00    10

Afterwards it looks like this:

                tms  sell  sell_to_predict
2015-11-23 15:00:00     6                2
2015-11-23 16:00:00     2               10
2015-11-23 17:00:00    10              NaN

I am creating a new column from a shift of another column, but it is not a shift by a fixed number of rows. It is a shift based on the index (here, the index is a timestamp).

Here is another example, a little more complex:

Before:

            sell  random
store hour              
1     1        1       9
      2        7       7
2     1        4       3
      2        2       3

After:

            sell  random  predict
store hour              
1     1        1       9        7
      2        7       7      NaN
2     1        4       3        2
      2        2       3      NaN

Can you provide a small example of the dataframe you would like to modify, and an example of what you are hoping to get out? From the example you provided it is unclear what index_tms and ['old_column'] actually represent. For instance, why would the following not work? df['new_column'] = df.index + timedelta(hours=1)
– johnchase, Commented Nov 23, 2015 at 17:03
Imagine I want to predict the number of cars I will sell in one hour. 'old_column' holds the number of cars I sold at the time I am using as an index. For that precise time, I then want to get the number of cars sold one hour later, so I want to create a 'new_column' containing the number of cars sold one hour later. I will edit my question to illustrate that.
– Borbag, Commented Nov 23, 2015 at 17:11

acushner · Accepted Answer · 2015-11-23 19:07:14Z

2

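Have you tried shift?

For example:

import pandas as pd

df = pd.DataFrame(list(range(4)))
df.columns = ['sold']
# shift(-1) moves each value up one row, so row i gets the value of row i + 1
df['predict'] = df.sold.shift(-1)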

df
   sold  predict
0     0        1
1     1        2
2     2        3
3     3      NaN


1 Comment

Borbag Over a year ago

This is not what I want, since it is not based on an index comparison: the neighbouring row does not always have an index exactly one hour apart. Moreover, although I did not say so in my question, I have one more index level (say, the id of the car store), so the neighbouring row can refer to a sale that happened in another store.
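
The gap problem described in this comment can be shown with a small sketch. The data below is hypothetical (the 17:00 row is deliberately missing), and the column names by_position and by_index are made up for illustration. It contrasts a positional shift with one possible index-based alternative, which passes freq to shift so that the index labels move instead of the values:

```python
import pandas as pd

# Hypothetical data: there is no 17:00 row, so the index has a gap.
df = pd.DataFrame(
    {"sell": [6, 2, 10]},
    index=pd.to_datetime(
        ["2015-11-23 15:00", "2015-11-23 16:00", "2015-11-23 18:00"]
    ),
)

# Positional shift: takes the next row, whatever its timestamp is.
df["by_position"] = df["sell"].shift(-1)

# Index-based shift: move every label back by one hour, then let the
# column assignment align on the index. Labels without a match get NaN.
df["by_index"] = df["sell"].shift(-1, freq=pd.Timedelta(hours=1))

print(df)
```

With this data, the 16:00 row receives the 18:00 value under the positional shift, but NaN under the index-based one, because no row exists at 17:00.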

Community · Accepted Answer · 2017-05-23 12:23:44Z

2

The answer was to resample so that the index has no holes, and then to apply the answer to this question: How do you shift Pandas DataFrame with a multiindex?
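
A minimal sketch of those two steps, not necessarily the exact code used here. It assumes an hourly grid is wanted, uses hypothetical data with a hole at 17:00, and reuses the store/hour example from the question. The names hourly and stores are invented for illustration, and the per-group shift is one common way to shift within a MultiIndex level, which may differ from the linked answer:

```python
import pandas as pd

# Step 1: single time index with a hole at 17:00 (hypothetical data).
df = pd.DataFrame(
    {"sell": [6, 2, 10]},
    index=pd.to_datetime(
        ["2015-11-23 15:00", "2015-11-23 16:00", "2015-11-23 18:00"]
    ),
)

# Resampling to a regular hourly grid inserts a NaN row for each missing hour,
# so a plain positional shift now really means "one hour later".
hourly = df.resample("60min").asfreq()
hourly["sell_to_predict"] = hourly["sell"].shift(-1)

# Step 2: MultiIndex case from the question. Shifting within each store
# keeps a value from leaking from one store into another.
stores = pd.DataFrame(
    {"sell": [1, 7, 4, 2], "random": [9, 7, 3, 3]},
    index=pd.MultiIndex.from_product([[1, 2], [1, 2]], names=["store", "hour"]),
)
stores["predict"] = stores.groupby(level="store")["sell"].shift(-1)

print(hourly)
print(stores)
```

For data that has both a store level and gaps in time, the resampling would have to be done per store before shifting; that combination is not shown here.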

answered Nov 24, 2015 at 10:00 by Borbag (edited May 23, 2017 at 12:23)
