3

I have some difficulty using 'LEFT JOIN LATERAL' function with postgresql 9.5.

In my table, there are three columns of 'ID', 'DATE', 'CODE'. One person (ID) have multiple rows as below. The number of ID is 362 and total row number is about 2500000.

ID   /  DATE     / CODE
1    /  20020101 / drugA
1    /  20020102 / drugA
1    /  20020103 / drugB
1    /  20020104 / drugA
1    /  20020105 / drugA
1    /  20020106 / drugB
1    /  20020107 / drugA
2    /  ...      / ...

I need to summarize information of drug A used between the first day and last day of drugB.

In the above case, only two rows should be remained for ID (1) [between 20020103 ~ 20020106; the period of drugB].

1    /  20020104 / drugA
1    /  20020105 / drugA

To take this job, I write SQL code using 'LEFT LATERAL JOIN' as below.

SELECT * FROM (SELECT ID, min(DATE) as start_date, max(DATE) as end_date from MAIN_TABLE WHERE CODE = 'drugA' GROUP BY ID) AA
LEFT JOIN LATERAL (SELECT ID, COUNT(ID) as no_tx, min(DATE) as fday_tx, max(DATE) lday_tx from MAIN_TABLE WHERE CODE = 'drugB' AND DATE > AA.start_date AND DATE < AA.end_date GROUP BY ID) as BB USING(ID);

There are only 362 person ID, but this postgresql code take about 2 mins.

It's too slow. Therefore, I tried another SQL code using subquery.

SELECT * FROM (SELECT ID, min(DATE) as start_date, max(DATE) as end_date from MAIN_TABLE WHERE CODE ='drugA' GROUP BY ID) AA
LEFT JOIN (
       SELECT ID, COUNT(ID) as no_tx, min(DATE) as fday_tx, max(DATE) lday_tx FROM (SELECT ID, DATE, CODE FROM MAIN_TABLE) BB
            LEFT JOIN (SELECT ID, min(DATE) as start_date, max(DATE) as end_date from MAIN_TABLE WHERE CODE ='drugA' GROUP BY ID) CC USING (ID)
            WHERE CODE = 'drugB' and DATE > start_date and DATE < end_date GROUP BY ID
            ) DD USING (ID);

This code is not simple but very fast (take only 1.6 sec).

When I compare the explain of two codes, the second code use hash join, but the first code do not.

Can I get some hint to improve the first code with 'LEFT LATERAL JOIN' function more efficiently?

2
  • Check both execution plans (using explain (analyze, verbose) and you'll see Commented Feb 15, 2016 at 15:55
  • Shouldn't join condition be inside the subquery when you're using LATERAL join ? (i.e. LEFT JOIN LATERAL (..... from MAIN_TABLE a WHERE a.id=aa.id AND ... ) BB .... Commented Feb 15, 2016 at 15:58

1 Answer 1

3

Why not just use a join and group by?

SELECT AA.ID, COUNT(B.ID) as no_tx, min(B.DATE) as fday_tx, max(B.DATE) as lday_tx,
       AA.start_date, AA.end_date
FROM (SELECT ID, min(DATE) as start_date, max(DATE) as end_date 
      FROM MAIN_TABLE
      WHERE CODE = 'drugA'
      GROUP BY ID
     ) AA LEFT JOIN
     MAIN_TABLE b
     ON b.CODE = 'drugB' AND b.DATE > AA.start_date AND b.DATE < AA.end_date
GROUP BY AA.ID,  AA.start_date, AA.end_date;

Or, perhaps more efficiently, window functions:

SELECT ID, SUM(CASE WHEN code = 'drugB' THEN 1 ELSE 0 END) as no_tx,
       MIN(CASE WHEN code = 'drugB' THEN DATE END) as fday_tx,
       MIN(CASE WHEN code = 'drugB' THEN DATE END) as lday_tx,
       start_date, end_date
FROM (SELECT t.*,
             MIN(CASE WHEN code = 'drugA' THEN date END) as start_date,
             MAX(CASE WHEN code = 'drugB' THEN date END) as end_date
      FROM MAIN_TABLE t
     ) t
WHERE code in ('drugA', 'drugB') AND
      date between start_date and end_date
GROUP BY t.id;
Sign up to request clarification or add additional context in comments.

1 Comment

the first SQL code work well (only 1.3 sec). However, second SQL code print error as below 'ERROR: column "t.id" must appear in the GROUP BY clause or be used in an aggregate function LINE 5: FROM (SELECT t.*, ^ ********** Error ********** ERROR: column "t.id" must appear in the GROUP BY clause or be used in an aggregate function SQL state: 42803 Character: 336'

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.