3

I am looking to construct a SQL query which is able to sum up installment values by month. Ordinarily this wouldn't be too hard as you would simply sum the installment values and group by month. However, the problem is not that easy and in the rest of the post, I'll illustrate why and solicit any help people are able to offer.

First it's important to note the installments column. Where installments is 1, that means the total value is paid at the time of the purchase. Where installments is greater than 1, this means that the total value is paid in the current month as well as the following months. For example, if we see transaction_id 9 and 10, this is a $100 transaction that has 2 installments which means that $50 will be paid in February and $50 will be paid in March.

Consider that we want to see the monthly bills for credit_card_id = 11111111. If we look to the installments column, we can see that the correct output should be as follows:

  • January: 19.99 + 75.3
  • February: 1337 + 75.3
  • March: 75.3

Again, just to be clear, the 75.3 in March occurs because we had a 3 installment transaction in January meaning the customer would be charged 75.3 in January, February, and March. The issue is that I don't know how to create the category for March from the given data.

Transactions

First, I re-created the table in SQL and was easily able to get all the transactions for the card by month with the following SQLite query

select strftime('%m', transaction_date) as Month, total_value, installment_value, installments 
from transactions 
WHERE credit_card_id = '11111111';

which outputs a table that looks like this output

However, it wasn't obvious how to split the 3 installment periods into 01, 02, and 03, so I created a new table with a column txn which is meant to give an id to the unique transactions that can be thought of as 1 group.

CREATE TABLE transactions (
    transaction_id int primary key,
    credit_card_id int,
    transaction_date timestamp,
    merchant_name varchar(256),
    total_value decimal(19,4),
    installment_value decimal(19,4),
    installments int,
    txn int
);

insert into transactions values(1,11111111,'2018-01-10T00:00:00','Colorful Soaps', 19.99, 19.99, 1, 1);
insert into transactions values(2,22222222,'2018-01-11T00:01:00','Cantina da Mamma',43.5,43.5,1,2);
insert into transactions values(3,33333333,'2018-01-12T01:02:00','Boulevard Hotel',129,129,1,3);
insert into transactions values(4,11111111,'2018-01-15T11:11:11','Micas Bar',225.9,75.3,3,4);
insert into transactions values(5,11111111,'2018-01-15T11:11:11','Micas Bar',225.9,75.3,3,4);
insert into transactions values(6,11111111,'2018-01-15T11:11:11','Micas Bar',225.9,75.3,3,4);
insert into transactions values(7,22222222,'2018-01-18T22:10:01','IPear Store',9999.99,9999.99,1,5);
insert into transactions values(8,11111111,'2018-02-20T21:08:32','Forrest Paintball',1337,1337,1,6);
insert into transactions values(9,44444444,'2018-02-22T00:05:30','Unicorn Costumes',100,50,2,7);
insert into transactions values(10,44444444,'2018-02-22T00:05:30','Unicorn Costumes',100,50,2,7);

My questions are

  1. Is it possible to get an output of the format I identified above in SQLite and if so, how?
  2. Do I have to have the txn column in order to get this information?

Thank you for your help.

3 Answers 3

1

Assuming you run SQLite 3.25+ version, consider using CTE and window function that creates a running count by same credit_card_id and transaction_date and uses this value to add needed months to transaction date. From there, aggregate according to new calculated date, install_date.

WITH cte AS
   (SELECT *,
       DATE(transaction_date, 
            '+' || (ROW_NUMBER() 
                      OVER(PARTITION BY transaction_date, credit_card_id 
                           ORDER BY transaction_date) - 1)
                || ' month'
            ) AS install_date       
    FROM transactions)

SELECT credit_card_id, 
       STRFTIME('%Y', install_date) AS install_year, 
       STRFTIME('%m', install_date) AS install_month, 
       SUM(installment_value) AS sum_installment_value
FROM cte 
GROUP BY credit_card_id, 
         STRFTIME('%Y', install_date), 
         STRFTIME('%m', install_date)
ORDER BY credit_card_id, 
         STRFTIME('%Y', install_date), 
         STRFTIME('%m', install_date);

Rextester Demo using PostgreSQL, since AFAIK no online fiddle (SQLFiddle, SQLiteonline, DBFiddle, etc.) supports SQLite with window functions

Sign up to request clarification or add additional context in comments.

1 Comment

Be carefull, when installment + transaction_date exceeds a year, it will fail. I tryed to warn about this in my answer (when case).
1

SQLlite has ROW_NUMBER() (checked here SQLlite).

   SELECT 
        installment_month
        ,credit_card_id
        ,SUM(installment_value) 
    FROM (
        SELECT 
            CASE WHEN strftime('%m',transaction_date) + ROW_NUMBER () OVER(PARTITION BY credit_card_id, transaction_date ORDER BY transaction_date) - 1 > 12
                    THEN strftime('%Y',transaction_date)*100+strftime('%m',transaction_date) + ROW_NUMBER () OVER(PARTITION BY credit_card_id, transaction_date ORDER BY transaction_date) - 1 + 88
                    ELSE strftime('%Y',transaction_date)*100+strftime('%m',transaction_date) + ROW_NUMBER () OVER(PARTITION BY credit_card_id, transaction_date ORDER BY transaction_date) - 1
            END as installment_month
            ,* 
        from transactions
        ) AS a
    GROUP by installment_month, credit_card_id

An issue comes when installments exceed two years. You will have to work that part. I would write this pice of code (case when...) inside a function it would make the whole query more clear.

3 Comments

SNR, can you explain why/how the failure happens when installments exceed two years?
Modify the data in your table with an installment bigger than 1 on december. Accepted answer will fail. Do the same with installment date bigger than 13 on december, my answer will fail. The @Caius Jard answer it won't fail as long as your calendar is large enough. The answer just have to be because now is generating too many rows per credit_card_id and we won't obtain the expected result.
@g.humpkins (edit sent) you can achieve the requirements by using a select distinct inside the from clause using the output of the inner join. You will get the expected result. But maybe there is a better way to do this, let's wait and see...
1

Here's a solution that doesn't require row numbering/will work in much older SQLite (any version that supports date() basically).. It just relies on a join to a calendar table (that you can generate using a variety of techniques, but in the linked example I just generated N rows by creating a table and doing a direct insert of the data I needed for the query) that has one row for the 1st day of every month. It uses a cartesian join condition that causes e.g. 3 rows for every payment that has 3 installments:

select 
  t.credit_card_id,
  date(cal.d, '-1 month') as month_of_installment,
  sum(t.installment_value)
from
  cal inner join transactions t on
  t.transaction_date between date(cal.d, '-'||installments||' months') and cal.d
group by
  t.credit_card_id,
  date(cal.d, '-1 month')

You can see the setup at https://www.db-fiddle.com/f/ogj2hK3cMwqp46MY6uVwX8/0

Incidentally, something is messed up in your question or your example data.

The monthly payments for credit card ID 11111111 are:

2018-01   245.89
2018-02   1562.9
2018-03   225.9

Your example data has 3 simultaneous payments to Micas Bar. We know these are different because they have a different transaction ID despite identical other data. January is thus 75.3 + 75.3 + 75.3 + 19.99, and not just 19.99 + 75.3 as the question asserts.

To see more of how the query works, run the non grouped form:

select 
  t.credit_card_id,
  date(cal.d, '-1 month') as month_of_installment,
  t.*
from
  cal inner join transactions t on
  t.transaction_date between date(cal.d, '-'||installments||' months') and cal.d
order by
  t.credit_card_id,
  date(cal.d, '-1 month')

Most DBAs I've met advocate having a numbers/dates table kicking around in a db to generate queries such as this - it's a fast way of generating sequences of rows and gives you a row to join to in cases where there are e.g. no transactions for a month (you can left join the transactions table onto the calendar table and get a single row, that sums to 0, for months with no transactions). Generating a bunch of months/days rows for the next 100 years is a trivial one-time op

1 Comment

when installment is bigger than 1 the installment_value has to be propagated for each moth. Answer edited to achieve requirements.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.