I have a dataframe of stock prices:

import numpy as np
import pandas as pd
df = pd.DataFrame([100, 101, 99, 100, 105, 104, 106], columns=['P'])

I would like to create a counter column that increments whenever the current price exceeds the highest price seen so far (like a high-water mark). If the price drops below that mark, the counter should stay where it is and only increase again once the mark is exceeded. Below is the desired column:

df['counter'] = [np.nan, 1, 1, 1, 2, 2, 3]

The second row's price is 101, which exceeds 100, so the counter is 1. The price then drops to 99 and comes back to 100, but the counter is still 1 because we have not exceeded 101 (the watermark). Once we exceed 101 in row 4, with a price of 105, the counter goes to 2. The price then drops to 104, so we stay at 2, and when it rises to 106 the counter increases to 3.
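For reference, the rule described above can be written as a plain Python loop. This is only an illustrative sketch, not a vectorized solution; the function name `watermark_counter` is made up here, and it assumes the first row has no counter value (NaN), as in the desired column.

```python
def watermark_counter(prices):
    """Count how many times a new high-water mark is set (hypothetical helper)."""
    if not prices:
        return []
    high = prices[0]          # the first price is the initial watermark
    count = 0
    result = [float("nan")]   # assumption: the first row has no counter value
    for price in prices[1:]:
        if price > high:      # strictly above the watermark: new high
            high = price
            count += 1
        result.append(count)
    return result


result = watermark_counter([100, 101, 99, 100, 105, 104, 106])
print(result)
```

The first element is NaN and the remaining elements match the desired counter values for the sample data.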

3 Answers

Algorithm:

  1. Find the running maximum at each row, i.e. the highest value observed so far, including the current row.

  2. Find the running maximum as of the preceding row.

  3. Whenever these two values differ, a new watermark has been set in the current row.

  4. Take the cumulative sum of the number of times a new watermark has been set.

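     # Running maximum up to and including each row
     df["current_observed_max"] = df["P"].cummax()
     # Running maximum as of the previous row (NaN for the first row)
     df["previous_observed_max"] = df["current_observed_max"].shift(1)
     # 1 where the running maximum changed, i.e. a new watermark was set
     df["is_new_watermark"] = (df["current_observed_max"] != df["previous_observed_max"]).astype(int)
     df["counter"] = df["is_new_watermark"].cumsum()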
    

The first row always registers as a new watermark (its previous maximum is NaN), so the counter starts at 1. Subtract 1 if you would like the first observed number to count as 0 instead.
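Putting the steps together, a self-contained sketch might look like the following. It assumes the column is named `P` as in the question, uses local variables instead of helper columns, subtracts 1, and then sets the first row to NaN to match the desired output; that last step is one possible way to handle the first observation, not the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([100, 101, 99, 100, 105, 104, 106], columns=["P"])

running_max = df["P"].cummax()        # step 1: highest value seen so far
previous_max = running_max.shift(1)   # step 2: same, as of the previous row

# Step 3: True where the running maximum changed. The first row is True
# because its previous maximum is NaN, and NaN never compares equal.
is_new_watermark = running_max != previous_max

# Step 4: cumulative count, shifted down by 1 so the first row is not counted.
counter = (is_new_watermark.cumsum() - 1).astype(float)
counter.iloc[0] = np.nan              # assumption: first row should be NaN

df["counter"] = counter
print(df)
```

The cast to float is needed because an integer column cannot hold NaN.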

1 Comment

Very nice, that works perfectly. Yeah, I can just subtract 1, that works.

A very simple and efficient method is to combine pandas.factorize and cummax:

df['counter'] = pd.factorize(df['P'].cummax())[0]

Output:

     P  counter
0  100        0
1  101        1
2   99        1
3  100        1
4  105        2
5  104        2
6  106        3
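To see why this works, it may help to look at the intermediate values. The running maximum never decreases, so each distinct value in it corresponds to one watermark, and `pd.factorize` labels distinct values 0, 1, 2, ... in order of first appearance. The variable names below are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([100, 101, 99, 100, 105, 104, 106], columns=["P"])

running_max = df["P"].cummax()
# factorize returns (codes, uniques): codes are integer labels per row,
# uniques are the distinct values in order of first appearance.
codes, uniques = pd.factorize(running_max)

print(running_max.tolist())  # [100, 101, 101, 101, 105, 105, 106]
print(codes.tolist())        # [0, 1, 1, 1, 2, 2, 3]
print(list(uniques))         # [100, 101, 105, 106]
```

Here `uniques` is the list of watermarks themselves, which can be useful if you also want to know the price at which each count was reached.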

1 Comment

That is very elegant, thank you for suggesting this!!!

Another way: flag the rows where the value equals the cumulative maximum, then take cumsum() of those flags to build the counter:

df['newP'] = (df['P'].cummax() == df['P']).cumsum() - 1

     P  newP
0  100     0
1  101     1
2   99     1
3  100     1
4  105     2
5  104     2
6  106     3
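One edge case the sample data does not exercise is a price that ties the current watermark without exceeding it. The equality test above counts such a tie as a new watermark, whereas the factorize approach does not. A small sketch with made-up prices shows the difference, so you can pick whichever behavior you want:

```python
import pandas as pd

# Hypothetical prices: 101 appears twice in a row, tying the watermark.
prices = pd.Series([100, 101, 101, 99, 105], name="P")
running_max = prices.cummax()

equality_counter = (running_max == prices).cumsum() - 1
factorize_counter = pd.factorize(running_max)[0]

print(equality_counter.tolist())   # [0, 1, 2, 2, 3]  (tie is counted)
print(factorize_counter.tolist())  # [0, 1, 1, 1, 2]  (tie is not counted)
```

If only strictly higher prices should count, as the question describes, the tie would need to be excluded.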

1 Comment

Ah, also very nice. Thank you!!
