I have a data frame of product numbers that are on sales promotions. The columns include the product number, start date, end date, promotion type and promotion description. The dates could span up to 4 months. I need to add rows to account for the months between the start and end dates.
Here is an example of the data as it currently looks:
import pandas as pd
sales_dict = {}
sales_dict['item'] = ['100179K', '100086K']
sales_dict['start_date'] = [201703, 201801]
sales_dict['end_date'] = [201707, 201802]
sales_dict['promotin_type'] = [1,0]
sales_dict['promotion_desc'] = [0,1]
df = pd.DataFrame.from_dict(sales_dict)
I tried to create a data frame of dates in year_month format, running from the beginning of the time frame through the end, and then join the two datasets. But some rows seemed to drop out of the result.
I also looked at "Creating a single column of dates from a column of start dates and a column of end dates - python", but I am not sure how to fill all the other columns correctly.
This is the result I want:
sales_dict = {}
sales_dict['item'] = ['100179K','100179K','100179K','100179K','100179K','100086K','100086K']
sales_dict['start_date'] = [201703, 201704, 201705, 201706, 201707, 201801, 201802]
sales_dict['promotin_type'] = [1,1,1,1,1, 0,0]
sales_dict['promotion_desc'] = [0,0,0,0,0,1,1]
df = pd.DataFrame.from_dict(sales_dict)
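For reference, one possible way to get from the first frame to the second is to build the list of months for each row and then explode that list into separate rows. This is only a minimal sketch, not a definitive answer. It assumes the dates are always integers in YYYYMM form and that pandas 0.25 or newer is available (for `DataFrame.explode`). The names `month_range` and `expanded` are hypothetical, made up for this illustration.

```python
import pandas as pd

# Same sample data as in the question (column names kept exactly as given).
df = pd.DataFrame({
    "item": ["100179K", "100086K"],
    "start_date": [201703, 201801],
    "end_date": [201707, 201802],
    "promotin_type": [1, 0],
    "promotion_desc": [0, 1],
})


def month_range(start, end):
    """Return every YYYYMM integer from start to end, inclusive.

    Hypothetical helper: assumes both arguments are ints in YYYYMM form.
    """
    # Convert YYYYMM to a running month count so year boundaries are handled.
    first = (start // 100) * 12 + (start % 100) - 1
    last = (end // 100) * 12 + (end % 100) - 1
    return [(m // 12) * 100 + (m % 12) + 1 for m in range(first, last + 1)]


# Replace start_date with the list of months, then give each month its own row.
# The other columns are repeated automatically for every exploded row.
expanded = (
    df.assign(
        start_date=[
            month_range(s, e) for s, e in zip(df["start_date"], df["end_date"])
        ]
    )
    .explode("start_date")
    .drop(columns="end_date")
    .reset_index(drop=True)
)

# explode leaves an object column, so cast the months back to integers.
expanded["start_date"] = expanded["start_date"].astype(int)

print(expanded)
```

Because every row is expanded from its own start and end dates, no join against a separate calendar frame is needed, which avoids rows being lost when the join keys do not line up.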