I have a DataFrame like this:
import pandas as pd
df = pd.DataFrame(data={'month': [2, 7, 4, 8], 'sales': [10, 40, 70, 50]})
I would like to get the sum of sales aggregated by month. However, I want to combine the months into two groups: the first for months 1-6 (resulting in sales of 80) and the second for months 7-12 (resulting in 90).
What's the best way to do this?
With duckdb (pip install duckdb) you could answer your query using SQL directly on the DataFrame, like so: duckdb.query("SELECT CASE WHEN month <= 6 THEN 1 ELSE 2 END AS halfyear, sum(sales) FROM df GROUP BY halfyear").to_df()
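For comparison, here is a minimal pandas-only sketch of the same half-year grouping, in case installing an extra package is not an option. It assumes month is always an integer from 1 to 12; the names halfyear and result are only illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame(data={'month': [2, 7, 4, 8], 'sales': [10, 40, 70, 50]})

# Assumption: month is an integer in 1..12.
# Integer division maps months 1-6 to half-year 1 and months 7-12 to half-year 2.
halfyear = ((df['month'] - 1) // 6 + 1).rename('halfyear')

# Group the sales by the derived key and sum each group.
result = df.groupby(halfyear)['sales'].sum()
print(result)
```

With the sample data this gives 80 for half-year 1 and 90 for half-year 2, matching the totals in the question. Passing a derived Series to groupby avoids adding a helper column to df.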