We have a transaction table, AET, that holds the accrual amount earned each day for every customer (identified by sub_account_id and money_group_id). At the end of each month (the pay date), we sum up the accrual amount earned by each customer (sub_account_id, money_group_id).

The AET table stores around 30 million transactions per day. On the pay date (the last day of the month), we aggregate over a full month of data (30M rows per day × 30 days). The table holds the last 90 days of data; anything older than 90 days is moved to a separate archive table. The table is currently range-partitioned by position_ts (daily) with 40 hash subpartitions on investment_product_cd, since we process by investment_product_cd. The table DDL is given below.

Table DDL

Index details

The pay-date query identified as time-consuming is given below. It currently takes around 2.5 hours to complete for a large investment_product_cd, which has 3 million customer entries for each day.

Pay Date Query

investment_product_cd --> There are around 1,500 to 2,000 investment_product_cds. Customers (sub_account_id, money_group_id) are enrolled in these investment_product_cds. There are around 10 to 15 large investment_product_cds, each with roughly 3M customers enrolled.

accrual_period_id --> This ID is generated for each pay period of an investment_product_cd. For example, one accrual_period_id is generated for investment_product_cd ABCD and the pay period running from 01-July-2023 (accrual start date) to 31-July-2023 (accrual end date).

close_out_ind --> Can have the value 'Y' or 'N'.
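The query itself is only attached as an image, but from the column descriptions above, the pay-date aggregation presumably looks something like the following sketch (column names such as accrual_amt_earned and the exact date predicates are assumptions, not the actual DDL):

```sql
-- Sketch of the pay-date aggregation: sum one month's daily accruals
-- per customer, for a single investment_product_cd.
SELECT aet.accrual_period_id,
       aet.sub_account_id,
       aet.money_group_id,
       SUM(aet.accrual_amt_earned) AS total_accrual_amt
FROM   aet
WHERE  aet.investment_product_cd = :product_cd
AND    aet.position_ts >= DATE '2023-07-01'
AND    aet.position_ts <  DATE '2023-08-01'
GROUP BY aet.accrual_period_id, aet.sub_account_id, aet.money_group_id
ORDER BY aet.sub_account_id, aet.money_group_id;
```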

We are trying to improve the performance of our pay-date query and bring the runtime down from 150 minutes (2.5 hours). Any suggestions are welcome.

I have generated explain plans with and without the FULL/PARALLEL hints.

Explain plan without hint

Explain plan with hint


2 Answers

For a summarization query like this, you want to ensure:

  1. that you are doing full segment scans, not using indexes
  2. that you are partition pruning
  3. that you are getting full use of parallelism
  4. that you have sufficient PGA to avoid multipass sorting

First, ditch the ORDER BY. There's no point in adding that expensive sort when you're simply writing the results to another table.

Then, try these hints:

SELECT /*+ FULL(aet) PARALLEL(16) */ aet.accrual_period_id, . . .

Also check "global memory bound" in v$pgastat and ensure it's at its maximum of 1G. If it isn't, have the DBA raise the database's pga_aggregate_target until the global bound hits 1G.
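A quick way to check (assuming you have SELECT privilege on the v$ views):

```sql
-- "global memory bound" is the per-workarea memory cap, which Oracle
-- limits to 1 GB; it is derived from pga_aggregate_target.
SELECT name, value, unit
FROM   v$pgastat
WHERE  name IN ('global memory bound', 'aggregate PGA target parameter');
```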

Ask how many CPUs are available, and if it's a beefy box with many dozens or hundreds of CPU cores, go ahead and raise the parallel degree from 16 to 32. Not only does that throw more cores into the mix, it also lets the query use more total PGA memory before spilling to temp.

Make sure you are actually getting the parallelism you're asking for. Look at the # of rows in gv$session for the sql_id you are running and make sure it's not just one row. It should be 2x the degree requested. If you're getting less, ask your DBA to look into why you're being downgraded.
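For example, a sketch of that check (here :sql_id is a bind variable holding the SQL_ID of the running statement):

```sql
-- A query running at degree 16 should show roughly 2 x 16 parallel
-- slave sessions (plus the query coordinator), not a single row.
SELECT COUNT(*) AS session_count
FROM   gv$session
WHERE  sql_id = :sql_id;
```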


Comments

global memory bound = 1073741824 bytes. Size: Standard E64ds v4, vCPUs: 64, RAM: 504 GiB. Regarding the ORDER BY clause: this pay-date query lives in a stored procedure and returns its results as a REF CURSOR to the application. The application reads it, forms smaller batches, and sends them downstream. We order the results so that, in case of any failure, the app can resume from where it left off.
I have posted the explain plans with and without the FULL/PARALLEL hints in the original post, as suggested, due to the size limitation here in comments.
Currently our transaction table holds the last 90 days of data, so we have around 90 daily partitions. For the aggregation on a pay date (the last day of the month, e.g. 31-July-2023), we scan 31 partitions (01-July-2023 through 31-July-2023) out of 90, take each customer's accrual amount from those 31 partitions, and sum it up. Aren't we already scanning only about a third of the partitions?
So I am thinking: instead of partitioning daily, can we add an additional column, paydate_ts, holding the pay date (the last day of the month), and partition on that column, so that all the data needed for the aggregation lands in a single partition? Any thoughts?
And no, changing the partitioning to monthly instead of daily won't help; Oracle will do the same amount of work either way. That PARTITION RANGE ITERATOR in the plan tells you it is pruning to only the days you need and skipping the other partitions. Even better, you are isolating a single hash subpartition (PARTITION HASH SINGLE) thanks to your equality predicate on that key. I think this should work great the way it is. Also, seeing that your data isn't all that big (Oracle expects to use only 1 GB of temp), forget about messing with PGA settings.

Maybe spread out the pain.

Add a trigger that aggregates each user's transactions into a summary table. Then, on the pay date, all you need to do is query the current summary for each user.
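A rough sketch of that idea (the summary table, column names, and trigger are all assumptions, since the actual DDL isn't reproduced in the question):

```sql
-- Hypothetical running-total table keyed by customer and accrual period.
CREATE TABLE accrual_summary (
  sub_account_id     NUMBER       NOT NULL,
  money_group_id     NUMBER       NOT NULL,
  accrual_period_id  NUMBER       NOT NULL,
  accrual_amt_earned NUMBER(18,4) DEFAULT 0 NOT NULL,
  CONSTRAINT accrual_summary_pk
    PRIMARY KEY (sub_account_id, money_group_id, accrual_period_id)
);

-- Keep the running total current as rows land in AET.
CREATE OR REPLACE TRIGGER aet_summary_trg
AFTER INSERT ON aet
FOR EACH ROW
BEGIN
  MERGE INTO accrual_summary s
  USING (SELECT :NEW.sub_account_id     AS sub_account_id,
                :NEW.money_group_id     AS money_group_id,
                :NEW.accrual_period_id  AS accrual_period_id,
                :NEW.accrual_amt_earned AS amt
         FROM   dual) n
  ON (    s.sub_account_id    = n.sub_account_id
      AND s.money_group_id    = n.money_group_id
      AND s.accrual_period_id = n.accrual_period_id)
  WHEN MATCHED THEN
    UPDATE SET s.accrual_amt_earned = s.accrual_amt_earned + n.amt
  WHEN NOT MATCHED THEN
    INSERT (sub_account_id, money_group_id, accrual_period_id, accrual_amt_earned)
    VALUES (n.sub_account_id, n.money_group_id, n.accrual_period_id, n.amt);
END;
/
```

Note that a row-level trigger adds overhead to every one of the 30M daily inserts; at that volume, a single batch MERGE from each day's new rows into the summary table after the daily load may be a cheaper way to achieve the same effect.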
