We have a transaction table, AET, that holds the accrual amount earned each day by a customer (identified by sub_account_id and money_group_id). At the end of each month (the pay date), we sum the accrual amount earned by each customer (sub_account_id, money_group_id).
The AET table stores around 30 million transactions per day. On the pay date (the last day of the month), we aggregate over roughly 900 million rows (30M per day * 30 days). The table holds the last 90 days of data, and anything older than 90 days is moved to a separate archive table. The table is currently partitioned on position_ts, with 40 hash subpartitions on investment_product_cd (because we process by investment_product_cd). The table DDL is given below.
The pay date query that we identified as time-consuming is given below. It currently takes around 2.5 hours to complete for a large investment_product_cd that has 3 million customer entries per day.
investment_product_cd --> There are around 1500 to 2000 investment_product_cd values. Customers (sub_account_id and money_group_id) are enrolled in these investment_product_cd values. Around 10 to 15 of them are large, with 3M customers enrolled in each.
accrual_period_id --> This ID is generated for each pay period of an investment_product_cd. For example, one accrual_period_id is generated for investment_product_cd ABCD and the pay period with Accrual Start Date 01-July-2023 and Accrual End Date 31-July-2023.
close_out_ind --> Can have the value 'Y' or 'N'.
We are trying to improve the performance of our pay date query and bring its runtime down from 150 minutes (2.5 hours). Any suggestions are welcome.
I generated explain plans for the query both with and without the full parallel hint.





