I am trying to calculate the number of days since the last holiday and until the next holiday. My calculation is shown below:

holidays = pd.Series(pd.to_datetime(["01.01.2013", "06.01.2013", "14.02.2013","29.03.2013",
                                    "31.03.2013", "01.04.2013", "01.05.2013", "03.05.2013",
                                    "19.05.2013", "26.05.2013", "30.05.2013", "23.06.2013",
                                    "15.07.2013", "27.10.2013", "01.11.2013", "11.11.2013",
                                    "24.12.2013", "25.12.2013", "26.12.2013", "31.12.2013",
                                            
                                    "01.01.2014", "06.01.2014", "14.02.2014", "30.03.2014",
                                    "18.04.2014", "20.04.2014", "21.04.2014", "01.05.2014",
                                    "03.05.2014", "03.05.2014", "26.05.2014", "08.06.2014",
                                    "19.06.2014", "23.06.2014", "15.08.2014", "26.10.2014",
                                    "01.11.2014", "11.11.2014", "24.12.2014", "25.12.2014",
                                    "26.12.2014", "31.12.2014",
                                            
                                    "01.01.2015", "06.01.2015", "14.02.2015", "29.03.2015",
                                    "03.04.2015", "05.04.2015", "06.04.2015", "01.05.2015",
                                    "03.05.2015", "24.05.2015", "26.05.2015", "04.06.2015",
                                    "23.06.2015", "15.08.2015", "25.10.2015", "01.11.2015",
                                    "11.11.2015", "24.12.2015", "25.12.2015", "26.12.2015",
                                    "31.12.2015"], dayfirst=True))

#Number of days until next holiday
d_until_next_holiday = []
#Number of days since last holiday
d_since_last_holiday = []

for row in data.itertuples():

    next_special_date = holidays[holidays >= row["Date"]].iloc[0]
    d_until_next_holiday.append((next_special_date - row["Date"])/pd.Timedelta('1D'))

    previous_special_date = holidays[holidays <= row.index].iloc[-1]
    d_since_last_holiday.append((row["Date"] - previous_special_date)/pd.Timedelta('1D'))

#Add new cols to DF
sto2STG14["d_until_next_holiday"] = d_until_next_holiday
sto2STG14["d_since_last_holiday"] = d_since_last_holiday

However, I get the following error:

TypeError: tuple indices must be integers or slices, not str

Why do I get this error? I know that row is a tuple, but I use .iloc[0] and .iloc[-1] in my code. What can I do?

  • Can you show your data? Also, what is sto2STG14? Apart from that, I think you should use for row in data.iterrows() with row[1]['Date'], or for i, row in data.iterrows() with row['Date'], because itertuples returns a tuple, which cannot be indexed by a string. Commented Feb 26, 2021 at 8:24
  • Why convert the list to a pandas object if you're going to iterate through it as tuples? Commented Feb 26, 2021 at 8:33
  • He is not using the tuples but only wants to access the date. That's why I asked what data is. Commented Feb 26, 2021 at 8:34
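
The point raised in the comments can be illustrated with a minimal sketch. The small `data` frame here is a hypothetical stand-in, since the original data was not posted; only a "Date" column is assumed. It shows that itertuples() yields namedtuples whose fields are read as attributes, while iterrows() yields (index, Series) pairs that accept string keys:

```python
import pandas as pd

# Hypothetical stand-in for the asker's `data`; only a "Date" column is assumed.
data = pd.DataFrame({"Date": pd.to_datetime(["2013-01-03", "2013-02-10"])})

# itertuples() yields namedtuples: fields are attributes, not string keys.
dates_from_tuples = [row.Date for row in data.itertuples()]

# iterrows() yields (index, Series) pairs, so string lookup works on the Series.
dates_from_rows = [row["Date"] for _, row in data.iterrows()]

# Reproduce the failure from the question: indexing a namedtuple with a string.
try:
    next(data.itertuples())["Date"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

The .iloc calls in the question are applied to the holidays Series, so they are not what fails; the string lookup on the tuple is.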

1 Answer

With pandas, you rarely need to loop. In this case, the .shift method lets you compute the gaps between consecutive holidays in one go:

import pandas
holidays = pandas.Series(pandas.to_datetime([
        "01.01.2013", "06.01.2013", "14.02.2013","29.03.2013",
        "31.03.2013", "01.04.2013", "01.05.2013", "03.05.2013",
        "19.05.2013", "26.05.2013", "30.05.2013", "23.06.2013",
        "15.07.2013", "27.10.2013", "01.11.2013", "11.11.2013",
        "24.12.2013", "25.12.2013", "26.12.2013", "31.12.2013",
        "01.01.2014", "06.01.2014", "14.02.2014", "30.03.2014",
        "18.04.2014", "20.04.2014", "21.04.2014", "01.05.2014",
        "03.05.2014", "03.05.2014", "26.05.2014", "08.06.2014",
        "19.06.2014", "23.06.2014", "15.08.2014", "26.10.2014",
        "01.11.2014", "11.11.2014", "24.12.2014", "25.12.2014",
        "26.12.2014", "31.12.2014",
        "01.01.2015", "06.01.2015", "14.02.2015", "29.03.2015",
        "03.04.2015", "05.04.2015", "06.04.2015", "01.05.2015",
        "03.05.2015", "24.05.2015", "26.05.2015", "04.06.2015",
        "23.06.2015", "15.08.2015", "25.10.2015", "01.11.2015",
        "11.11.2015", "24.12.2015", "25.12.2015", "26.12.2015",
        "31.12.2015"
    ], dayfirst=True)
)

results = (
    holidays
    .sort_values()
    .to_frame('holiday')
    .assign(
        days_since_prev=lambda df: df['holiday'] - df['holiday'].shift(1),
        days_until_next=lambda df: df['holiday'].shift(-1) - df['holiday'],
    )
)

results.head(10)

And I get:

     holiday days_since_prev days_until_next
0 2013-01-01             NaT          5 days
1 2013-01-06          5 days         39 days
2 2013-02-14         39 days         43 days
3 2013-03-29         43 days          2 days
4 2013-03-31          2 days          1 days
5 2013-04-01          1 days         30 days
6 2013-05-01         30 days          2 days
7 2013-05-03          2 days         16 days
8 2013-05-19         16 days          7 days
9 2013-05-26          7 days          4 days
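
The table above gives the gaps between consecutive holidays. If the goal is the distance from each row of a separate frame to its nearest holidays, as in the question, one possible loop-free sketch uses numpy.searchsorted on the sorted holiday dates. This is a different technique from .shift, the holiday list is shortened for illustration, and the `data` frame is a hypothetical stand-in with only a "Date" column assumed:

```python
import numpy as np
import pandas as pd

# Shortened holiday list for illustration; the full list works the same way.
holidays = pd.Series(pd.to_datetime(
    ["01.01.2013", "06.01.2013", "14.02.2013", "29.03.2013"], dayfirst=True
)).sort_values().reset_index(drop=True)

# Hypothetical stand-in for the asker's `data`; only a "Date" column is assumed.
data = pd.DataFrame({"Date": pd.to_datetime(["2013-01-03", "2013-01-06", "2013-03-01"])})

hol = holidays.to_numpy()
dates = data["Date"].to_numpy()

# Position of the first holiday >= each date, and of the last holiday <= each date.
next_pos = np.searchsorted(hol, dates, side="left")
prev_pos = np.searchsorted(hol, dates, side="right") - 1

# Assumes every date lies within the holiday range; otherwise the positions
# fall outside the array and need clipping or masking first.
data["d_until_next_holiday"] = (hol[next_pos] - dates) / np.timedelta64(1, "D")
data["d_since_last_holiday"] = (dates - hol[prev_pos]) / np.timedelta64(1, "D")
```

A date that is itself a holiday gets 0 in both columns, which matches the >= and <= comparisons used in the question.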