0

I am migrating some legacy code from an Oracle DB to an SQL Server DB. I have come along this piece of code, with the purpose of calculating the cumulative amount over the last 6 months/ year, for each record/group:

select
     a,b,c,d,e,trunc(record_date, 'MM'),
        sum(sum(amount)) over(
            partition by a, b, c, d 
            order by trunc(record_date, 'MM')
            range between interval 5 month preceding and current row) semester_amount,
        sum(sum(amount)) over(
            partition by a, b, c, d
            order by trunc(record_date, 'MM')
            range between interval '11' month preceding and current row) year_amount
from table
group by
      a,b,c,d,e,trunc(record_date, 'MM')

My goal is to implement it in SQL Server where the date range in window partitions isn't available.

Here is a sample table I use for experimentation:

drop table if exists dbo.test_window_frames;
create table dbo.test_window_frames (
    fielda char(1),
    fieldb char(1),
    fieldc char(2),
    fieldd char(2),
    fielde char(2),
    amount int,
    record_date date);
insert into dbo.test_window_frames (fielda, fieldb, fieldc, fieldd, fielde, amount, record_date)
values
    ('A', 'X', 'C1', 'D2', 'E2', 74, '2024-03-10'),
    ('A', 'X', 'C1', 'D2', 'E2', 34, '2024-03-12'),
    ('A', 'X', 'C1', 'D2', 'E2', 24, '2024-03-12'),
    ('A', 'X', 'C1', 'D2', 'E1', 24, '2024-03-25'),
    ('A', 'X', 'C1', 'D2', 'E1', 95, '2024-06-20'),
    ('A', 'X', 'C1', 'D2', 'E1', 81, '2024-06-27'),
    ('A', 'X', 'C1', 'D2', 'E1', 83, '2024-07-10'),
    ('A', 'X', 'C1', 'D2', 'E2', 73, '2024-07-27'),
    ('A', 'X', 'C1', 'D2', 'E2', 60, '2024-10-07'),
    ('A', 'X', 'C1', 'D2', 'E2', 50, '2024-10-07'),
    ('A', 'X', 'C2', 'D1', 'E1', 67, '2024-04-02'),
    ('A', 'X', 'C2', 'D1', 'E1', 69, '2024-12-04'),
    ('B', 'X', 'C1', 'D1', 'E1', 50, '2024-01-15'),
    ('B', 'X', 'C1', 'D1', 'E2', 30, '2024-02-20'),
    ('B', 'X', 'C2', 'D2', 'E1', 80, '2024-03-22'),
    ('B', 'X', 'C2', 'D2', 'E2', 90, '2024-05-10'),
    ('B', 'X', 'C1', 'D1', 'E1', 55, '2024-06-01'),
    ('B', 'X', 'C2', 'D2', 'E2', 20, '2024-07-19'),
    ('B', 'X', 'C1', 'D1', 'E2', 40, '2024-08-14'),
    ('B', 'X', 'C1', 'D1', 'E2', 5, '2024-08-14'),
    ('B', 'X', 'C1', 'D2', 'E1', 65, '2024-09-25'),
    ('B', 'X', 'C2', 'D1', 'E2', 35, '2024-10-12'),
    ('B', 'X', 'C2', 'D2', 'E1', 75, '2024-11-05');

The desired result is for each group (for example ('A', 'X', 'C1', 'D2', 'E2', '2024-03-01') the date is truncated) to have a row with the amount of the past 6 months and the past year since that group's date. (132 for both in this case)

Until now I tried my luck with subqueries:

select 
    t1.fielda,
    t1.fieldb,
    t1.fieldc,
    t1.fieldd,
    t1.fielde,
    dateadd(month, datediff(month, 0, t1.record_date), 0) as reference_date,
    (
        select sum(t2.amount)
        from dbo.test_window_frames t2
        where t2.fielda = t1.fielda
          and t2.fieldb = t1.fieldb
          and t2.fieldc = t1.fieldc
          and t2.fieldd = t1.fieldd
          and t2.record_date between dateadd(month, -5, t1.record_date) and t1.record_date
    ) as semester_amount,
    (
        select sum(t2.amount)
        from dbo.test_window_frames t2
        where t2.fielda = t1.fielda
          and t2.fieldb = t1.fieldb
          and t2.fieldc = t1.fieldc
          and t2.fieldd = t1.fieldd
          and t2.record_date between dateadd(month, -11, t1.record_date) and t1.record_date
    ) as year_amount
from 
    dbo.test_window_frames t1
group by 
    t1.fielda, t1.fieldb, t1.fieldc, t1.fieldd, t1.fielde, dateadd(month, datediff(month, 0, t1.record_date), 0);

But I get the following error:

Column 'dbo.test_window_frames.record_date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Can you help me?

I attach the expected results that I believe are the output of the Oracle code, for the fieldA = 'A'.

fielda fieldb fieldc fieldd fielde reference_date semester_amount year_amount
A X C1 D2 E2 2024-03-01 132 132
A X C1 D2 E1 2024-03-01 24 24
A X C1 D2 E1 2024-06-01 200 200
A X C1 D2 E1 2024-07-01 283 283
A X C1 D2 E2 2024-07-01 205 205
A X C1 D2 E2 2024-10-01 183 315
A X C2 D1 E1 2024-04-01 67 67
A X C2 D1 E1 2024-12-01 69 136
13
  • So, for your sample data, what are the results you are expecting here exactly? Commented Sep 13, 2024 at 13:06
  • Calculate the sums in a subquery and then aggregate that stuff on the outside Commented Sep 13, 2024 at 13:14
  • Hey guys, I updated with the expected results, in case you wanna check them out Commented Sep 13, 2024 at 13:26
  • I'm not sure how 132 is the result for your first row, if I am honest. You need to explain what your Oracle code does and how you get the results you do. Commented Sep 13, 2024 at 13:29
  • I am sorry, I will try to edit the description to be more explanatory. So for the first group ('A', 'X', 'C1', 'D2', 'E2', '2024-03-01') it sums all the amounts that were in the past 6 months / a year before that record_date, not sys_date. So for that group there are 3 records prior, with amounts 74,34,24. On closer inspection, the oracle code seems problematic, because comparisons are made to the original date, but the group is made on the truncated date. But apart from that my question still stands. Commented Sep 13, 2024 at 13:35

2 Answers 2

0

The following should produce the same results as your original Oracle query:

select 
    t1.fielda,
    t1.fieldb,
    t1.fieldc,
    t1.fieldd,
    t1.fielde,
    rd.reference_date,
    sum(t1.amount) as current_amount, -- *** Added for reference ***
    (
        select sum(t2.amount)
        from dbo.test_window_frames t2
        where t2.fielda = t1.fielda
          and t2.fieldb = t1.fieldb
          and t2.fieldc = t1.fieldc
          and t2.fieldd = t1.fieldd
          --and t2.fielde = t1.fielde -- *** ??? ***
          and t2.record_date >= dateadd(month, -5, rd.reference_date) -- *** Modified ***
          and t2.record_date < dateadd(month, 1, rd.reference_date)   -- *** Modified ***
    ) as semester_amount,
    (
        select sum(t2.amount)
        from dbo.test_window_frames t2
        where t2.fielda = t1.fielda
          and t2.fieldb = t1.fieldb
          and t2.fieldc = t1.fieldc
          and t2.fieldd = t1.fieldd
          --and t2.fielde = t1.fielde -- *** ??? ***
          and t2.record_date >= dateadd(month, -11, rd.reference_date) -- *** Modified ***
          and t2.record_date < dateadd(month, 1, rd.reference_date)    -- *** Modified ***
    ) as year_amount
from 
    dbo.test_window_frames t1
cross apply ( -- *** Added ***
    select datetrunc(month, t1.record_date) as reference_date
    --select dateadd(month, datediff(month, 0, t1.record_date), 0) as reference_date
) rd
group by 
    t1.fielda, t1.fieldb, t1.fieldc, t1.fieldd, t1.fielde, rd.reference_date
order by
    t1.fielda, t1.fieldb, t1.fieldc, t1.fieldd, rd.reference_date, t1.fielde;

The first thing I did was to move the reference_date calculation to a CROSS APPLY so that it can be defined once and conveniently referenced several times elsewhere. For SQL Server 2022 and later, you can use the DATETRUNC() function. For earlier versions, the dateadd(xxx, datediff(xxx, 0, ...), 0) hack can still be used.

I also changed the date range in the subqueries to be half-open intervals >= current month - N and < current month + 1. This cleanly includes all dates through the end of the current month.

Lastly, I added the and t2.fielde = t1.fielde condition to the subqueries, as this was needed to match your desired results. However, I later commented out this condition, as it does not match the original Oracle query.

Because this query performs multiple range-lookups in subselects (instead of the original Oracle window functions), it is essential that those lookups be supported by an appropriate index such as test_window_frames(fielda, fieldb, fieldc, fieldd, record_date) INCLUDE(amount). (The INCLUDE part saves a bunch of clustered index seeks.)

Results:

fielda fieldb fieldc fieldd fielde reference
date
current
amount
semester
amount
year
amount
A X C1 D2 E1 2024-03-01 24 156 156
A X C1 D2 E2 2024-03-01 132 156 156
A X C1 D2 E1 2024-06-01 176 332 332
A X C1 D2 E1 2024-07-01 83 488 488
A X C1 D2 E2 2024-07-01 73 488 488
A X C1 D2 E2 2024-10-01 110 442 598
A X C2 D1 E1 2024-04-01 67 67 67
A X C2 D1 E1 2024-12-01 69 69 136
B X C1 D1 E1 2024-01-01 50 50 50
B X C1 D1 E2 2024-02-01 30 80 80
B X C1 D1 E1 2024-06-01 55 135 135
B X C1 D1 E2 2024-08-01 45 100 180
B X C1 D2 E1 2024-09-01 65 65 65
B X C2 D1 E2 2024-10-01 35 35 35
B X C2 D2 E1 2024-03-01 80 80 80
B X C2 D2 E2 2024-05-01 90 170 170
B X C2 D2 E2 2024-07-01 20 190 190
B X C2 D2 E1 2024-11-01 75 95 265

Note that the above does not match your posted results, but should match the original Oracle results. (See fiddle links below.)

If you uncomment the and t2.fielde = t1.fielde condition, the results will match your posted desired results, but not the original Oracle results.

fielda fieldb fieldc fieldd fielde reference
date
current
amount
semester
amount
year
amount
A X C1 D2 E1 2024-03-01 24 24 24
A X C1 D2 E2 2024-03-01 132 132 132
A X C1 D2 E1 2024-06-01 176 200 200
A X C1 D2 E1 2024-07-01 83 283 283
A X C1 D2 E2 2024-07-01 73 205 205
A X C1 D2 E2 2024-10-01 110 183 315
A X C2 D1 E1 2024-04-01 67 67 67
A X C2 D1 E1 2024-12-01 69 69 136
... ... ... ... ... ... ... ... ...

A slightly different approach is to group and calculate the monthly summaries in a CTE. The semester_amount and year_amount lookups are then performed against the CTE instead of the original table as part of the final query. It is even possible to combine the two sum calculations by retrieving the prior 12-months of data once and using a variation of conditional aggregation to calculate two different sums.

with monthly_sums as (
    select 
        t1.fielda,
        t1.fieldb,
        t1.fieldc,
        t1.fieldd,
        t1.fielde,
        rd.reference_date,
        sum(t1.amount) as current_amount
    from 
        dbo.test_window_frames t1
    cross apply (
        select datetrunc(month, t1.record_date) as reference_date
    ) rd
    group by 
        t1.fielda, t1.fieldb, t1.fieldc, t1.fieldd, t1.fielde, rd.reference_date
)
select 
    ms.fielda,
    ms.fieldb,
    ms.fieldc,
    ms.fieldd,
    ms.fielde,
    ms.reference_date,
    a.semester_amount,
    a.year_amount
from 
    monthly_sums ms
cross apply (
    select
        sum(ms2.current_amount * x.is_semester) as semester_amount,
        sum(ms2.current_amount) as year_amount
    from monthly_sums ms2
    cross apply (
        select case when ms2.reference_date >= dateadd(month, -5, ms.reference_date)
               then 1 else 0 end as is_semester
    ) x
    where ms2.fielda = ms.fielda
      and ms2.fieldb = ms.fieldb
      and ms2.fieldc = ms.fieldc
      and ms2.fieldd = ms.fieldd
      and ms2.reference_date >= dateadd(month, -11, ms.reference_date)
      and ms2.reference_date <= ms.reference_date  -- Inclusive
) a
order by
    ms.fielda, ms.fieldb, ms.fieldc, ms.fieldd, ms.reference_date, ms.fielde;

if you later add date-range filters, take care to include the extra 12 months of prior data in the CTE calculations to support the later semester_amount and year_amount calculations. The final query would include a filter to exclude the earlier data.

You should run performance tests on the various queries to determine what works best for your data. Neither is likely to be as performant as one that used a values-between window range, if only SQL Server would support that.

See this db<>fiddle for a demo of the above. For comparison, this db<>fiddle shows the results of the original Oracle query against the same posted sample data.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you so much for the extensive answer and material! This seems to work. Also, the desired results were made in a rush by hand, where I accidentally grouped by the extra column, as you gathered. The rest of the queries seem right. Also sorry for the late reply, last week was really busy. Have a nice one!
0

Here is another Oracle query that should do the same as your original but should be much easier to convert to SQL Server syntax, the idea being to calculate a pseudo row number based on the month difference between rows of the same partition, then you join with the original rows and conditionally sum the amount based on the previously calculated month difference:

with convdata(fielda, fieldb, fieldc, fieldd, fielde, record_date, amount,rnm) as (
    select fielda, fieldb, fieldc, fieldd, fielde, record_date, amount, 
        sum(rnm) over(partition by fielda, fieldb, fieldc, fieldd, fielde order by record_date) as rnm
    from (
        select fielda, fieldb, fieldc, fieldd, fielde,
            trunc( record_date, 'MM' ) as record_date,
            sum(amount) as amount,
            coalesce(months_between(
                trunc( record_date, 'MM' ),
                lag(trunc( record_date, 'MM' )) over(partition by fielda, fieldb, fieldc, fieldd, fielde order by trunc( record_date, 'MM' ))
                ),0) as rnm
        from data d
        group by fielda, fieldb, fieldc, fieldd, fielde, trunc( record_date, 'MM' )
    )
)
select c1.fielda, c1.fieldb, c1.fieldc, c1.fieldd, c1.fielde,
    c1.record_date, c1.amount, 
    sum(case when c2.rnm between c1.rnm - 6 and c1.rnm then c2.amount else 0 end) as sum6m,
    sum(case when c2.rnm between c1.rnm - 12 and c1.rnm then c2.amount else 0 end) as sum12m
from convdata c1
join convdata c2 on c1.fielda = c2.fielda 
    and c1.fieldb = c2.fieldb and c1.fieldc = c2.fieldc and c1.fieldd = c2.fieldd and c1.fielde = c2.fielde
    and c2.rnm between c1.rnm - 12 and c1.rnm
group by c1.fielda, c1.fieldb, c1.fieldc, c1.fieldd, c1.fielde, c1.record_date, c1.amount
order by c1.record_date, c1.fielda, c1.fieldb, c1.fieldc, c1.fieldd, c1.fielde,  c1.amount
;

Remain to convert to SQL Server syntax: adding aliases to subqueries, TRUNC date to month and MONTHS_BETWEEN, the rest should be compatible as is.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.