I have been trying to get this to work and cannot find a solution. I have data that looks like this in dataframe (df):
index plant_name business_name power_kwh mos_time day month year
0 PROVIDENCE HEIGHTS UNITED STATES 7805.7 2023-02-25 08:00:00 56 2 2023
1 PROVIDENCE HEIGHTS UNITED STATES 9943.7 2023-02-25 07:00:00 56 2 2023
2 PROVIDENCE HEIGHTS UNITED STATES 9509.8 2023-02-25 06:00:00 56 2 2023
3 PROVIDENCE HEIGHTS UNITED STATES 8333 2023-02-25 05:00:00 56 2 2023
2993 PROVIDENCE HEIGHTS UNITED STATES 14560 2022-10-25 17:00:00 298 10 2022
2994 PROVIDENCE HEIGHTS UNITED STATES 9260.4 2022-10-25 16:00:00 298 10 2022
2995 PROVIDENCE HEIGHTS UNITED STATES 7327.7 2022-10-25 15:00:00 298 10 2022
2996 PROVIDENCE HEIGHTS UNITED STATES 5579.1 2022-10-25 14:00:00 298 10 2022
2997 PROVIDENCE HEIGHTS UNITED STATES 4507 2022-10-25 13:00:00 298 10 2022
13993 PROVIDENCE HEIGHTS UNITED STATES 1655.3 2021-07-19 14:00:00 200 7 2021
13994 PROVIDENCE HEIGHTS UNITED STATES 1686.1 2021-07-19 13:00:00 200 7 2021
13995 PROVIDENCE HEIGHTS UNITED STATES 2243.7 2021-07-19 12:00:00 200 7 2021
13996 PROVIDENCE HEIGHTS UNITED STATES 3577.9 2021-07-19 11:00:00 200 7 2021
33995 PROVIDENCE HEIGHTS UNITED STATES 2220.2 2019-04-05 20:00:00 95 4 2019
33996 PROVIDENCE HEIGHTS UNITED STATES 2266.7 2019-04-05 19:00:00 95 4 2019
33997 PROVIDENCE HEIGHTS UNITED STATES 2292.4 2019-04-05 18:00:00 95 4 2019
33998 PROVIDENCE HEIGHTS UNITED STATES 2197 2019-04-05 17:00:00 95 4 2019
The dict that I need to use to assign week numbers looks like this:
weeks = {
1: [1, 2, 3, 4, 5],
2: [6, 7, 8, 9],
3: [10, 11, 12, 13],
4: [14, 15, 16, 17],
5: [18, 19, 20, 21, 22],
6: [23, 24, 25, 26],
7: [27, 28, 29, 30, 31],
8: [32, 33, 34, 35],
9: [36, 37, 38, 39],
10: [40, 41, 42, 43, 44],
11: [45, 46, 47, 48],
12: [49, 50, 51, 52]
}
And, my answer looks like this with the added "week" column on the far right column. thank you for any help here.
index plant_name business_name power_kwh mos_time day month year week
0 PROVIDENCE HEIGHTS UNITED STATES 7805.7 2023-02-25 08:00:00 56 2 2023 8
1 PROVIDENCE HEIGHTS UNITED STATES 9943.7 2023-02-25 07:00:00 56 2 2023 8
2 PROVIDENCE HEIGHTS UNITED STATES 9509.8 2023-02-25 06:00:00 56 2 2023 8
3 PROVIDENCE HEIGHTS UNITED STATES 8333 2023-02-25 05:00:00 56 2 2023 8
2993 PROVIDENCE HEIGHTS UNITED STATES 14560 2022-10-25 17:00:00 298 10 2022 43
2994 PROVIDENCE HEIGHTS UNITED STATES 9260.4 2022-10-25 16:00:00 298 10 2022 43
2995 PROVIDENCE HEIGHTS UNITED STATES 7327.7 2022-10-25 15:00:00 298 10 2022 43
2996 PROVIDENCE HEIGHTS UNITED STATES 5579.1 2022-10-25 14:00:00 298 10 2022 43
2997 PROVIDENCE HEIGHTS UNITED STATES 4507 2022-10-25 13:00:00 298 10 2022 43
13993 PROVIDENCE HEIGHTS UNITED STATES 1655.3 2021-07-19 14:00:00 200 7 2021 30
13994 PROVIDENCE HEIGHTS UNITED STATES 1686.1 2021-07-19 13:00:00 200 7 2021 30
13995 PROVIDENCE HEIGHTS UNITED STATES 2243.7 2021-07-19 12:00:00 200 7 2021 30
13996 PROVIDENCE HEIGHTS UNITED STATES 3577.9 2021-07-19 11:00:00 200 7 2021 30
33995 PROVIDENCE HEIGHTS UNITED STATES 2220.2 2019-04-05 20:00:00 95 4 2019 14
33996 PROVIDENCE HEIGHTS UNITED STATES 2266.7 2019-04-05 19:00:00 95 4 2019 14
33997 PROVIDENCE HEIGHTS UNITED STATES 2292.4 2019-04-05 18:00:00 95 4 2019 14
33998 PROVIDENCE HEIGHTS UNITED STATES 2197 2019-04-05 17:00:00 95 4 2019 14
I have tried things like the function below but get the ValueError: Week number not found in dictionary:
weeks = {
1: [1, 2, 3, 4, 5],
2: [6, 7, 8, 9],
3: [10, 11, 12, 13],
4: [14, 15, 16, 17],
5: [18, 19, 20, 21, 22],
6: [23, 24, 25, 26],
7: [27, 28, 29, 30, 31],
8: [32, 33, 34, 35],
9: [36, 37, 38, 39],
10: [40, 41, 42, 43, 44],
11: [45, 46, 47, 48],
12: [49, 50, 51, 52]
}
def get_week_number(day, month):
for num, days in weeks.items():
if day in days and num <= 12:
if month == num:
return num
raise ValueError("Week number not found in dictionary.")
# add a column for week number
df['week'] = ncData.apply(lambda row: get_week_number(row['day'], row['month']), axis=1)