I have a table (dataset_final) that records the number of units sold (field quantity) of each good in each store for each week of the year. There are about 200 thousand unique goods and about 50 stores, and the data covers a period of 6 years.
dataset_final
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
I would like to fill in the missing values with zero, i.e. the weeks of the year in which a given combination of good and store had no sales. For example:
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
| 2017 | 39 | 137233 | 9 | 0 |
+---------+-------------+---------+----------+----------+
| 2016 | 36 | 152501 | 23 | 0 |
+---------+-------------+---------+----------+----------+
My plan was to find all unique combinations of year_id, week_number, good_id and store_id, and add only those that are not already in the dataset_final table. My query:
WITH t1 AS (
    SELECT DISTINCT
         [year_id]
        ,[week_number]
        ,[good_id]
        ,[store_id]
    FROM [fs_db].[dbo].[ds_dataset_final]
),
t2 AS (
    SELECT DISTINCT [year_id], [week_number]
    FROM [fs_db].[dbo].[ds_dataset_final]
)
SELECT t2.[year_id], t2.[week_number], t1.[good_id], t1.[store_id]
FROM t1
FULL JOIN t2
    ON  t2.[year_id] = t1.[year_id]
    AND t2.[week_number] = t2.[week_number]  -- compares t2.[week_number] with itself, so the join is not restricted by week
This query produces about 1.2 billion unique combinations, which seems like too many.
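One way to sanity-check that number: the join above matches on year_id only (the week condition compares t2.[week_number] with itself), and t1 holds one row per week in which a pair actually sold. So each (good, store, year) is expanded to every week of the year once per sold week, which means the result likely contains duplicates rather than 1.2 billion distinct combinations. For scale, a full grid of every good in every store for every week would be roughly 200,000 × 50 × 6 × 52 ≈ 3.1 billion rows (assuming 52 weeks per year). The following is a minimal sketch, not a tested solution, that counts the deduplicated grid instead. The CTE names pairs and weeks and the alias grid_rows are made up for illustration.

```sql
-- Sketch only: size of the grid "every week of a year, for each (good, store) that sold in that year".
WITH pairs AS (
    -- One row per (year, good, store), regardless of how many weeks it sold in.
    SELECT DISTINCT [year_id], [good_id], [store_id]
    FROM [fs_db].[dbo].[ds_dataset_final]
),
weeks AS (
    -- Weeks that occur in the data; assumes every calendar week has at least one sale somewhere.
    SELECT DISTINCT [year_id], [week_number]
    FROM [fs_db].[dbo].[ds_dataset_final]
)
SELECT COUNT_BIG(*) AS grid_rows   -- COUNT_BIG because the result may exceed the INT range
FROM pairs AS p
JOIN weeks AS w
    ON w.[year_id] = p.[year_id];
```

If grid_rows comes out far below 1.2 billion, the difference is the duplication introduced by joining the week-level rows of t1 on year alone.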
Also, I only need combinations from the start of sales of each good. For example, if the table has sales of a particular product only from 2017 onwards, I do not need to fill in earlier data.
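The "from the start of sales" requirement could be sketched as follows. This is a hedged outline rather than a definitive answer. It assumes year_id and week_number are integers with week_number at most 53 (so year_id * 100 + week_number sorts chronologically), that the table has at most one row per (year_id, week_number, good_id, store_id), and that every calendar week appears somewhere in the data. The names weeks, first_sale, week_key and first_week_key are invented for the example, and "start of sales" is taken per good and store pair.

```sql
WITH weeks AS (
    -- Calendar of weeks derived from the data itself, with a sortable key such as 201737.
    SELECT DISTINCT [year_id], [week_number],
           [year_id] * 100 + [week_number] AS week_key
    FROM [fs_db].[dbo].[ds_dataset_final]
),
first_sale AS (
    -- First week in which each (good, store) pair was sold.
    SELECT [good_id], [store_id],
           MIN([year_id] * 100 + [week_number]) AS first_week_key
    FROM [fs_db].[dbo].[ds_dataset_final]
    GROUP BY [good_id], [store_id]
)
SELECT w.[year_id], w.[week_number], f.[good_id], f.[store_id],
       COALESCE(d.[quantity], 0) AS quantity   -- 0 where no sale was recorded
FROM first_sale AS f
JOIN weeks AS w
    ON w.week_key >= f.first_week_key          -- skip weeks before the pair's first sale
LEFT JOIN [fs_db].[dbo].[ds_dataset_final] AS d
    ON  d.[good_id]     = f.[good_id]
    AND d.[store_id]    = f.[store_id]
    AND d.[year_id]     = w.[year_id]
    AND d.[week_number] = w.[week_number];
```

To add only the missing rows to the table, the same SELECT can be filtered with WHERE d.[good_id] IS NULL and used as the source of an INSERT. As written, the sketch fills zeros up to the last week present in the data. If filling should stop at each pair's last sale, a MAX-based upper bound can be added alongside first_week_key.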
Are you after one row for every year, month, store and good?