3

I am trying to query another software's DB and so I have no control over making indexes, functions, etc. I have read-only access, so I don't believe I can make a temporary table even. I just can't understand why the difference.

Here is my basic query:

WITH vCriteria AS (
    SELECT vSub.visit_id, vd.vdiag_diag from visit vSUB  
    JOIN insurance_company_table1 inscomp1 ON inscomp1.inscomp1_comp = vSub.visit_ins 
        and inscomp1.inscomp1_arid = vSub.visit_arid
    JOIN da152 da ON da.da152_arid = vSub.visit_arid 
    LEFT JOIN insurance1 i ON i.is1_num = vSub.visit_id AND i.is1_ins_full = vSub.visit_ins
    LEFT JOIN visit_diagnosis vd ON vd.vdiag_visit_key = vSub.visit_id

    WHERE vSub.visit_mr_fdt 
    BETWEEN (now() - 50 * INTERVAL '1 day') AND (now() + 10 * INTERVAL '1 day') 
    AND NULLIF (vSub.visit_disch_date, '0001-01-01' ::date) IS NOT NULL 
    AND Now() >=(vSub.visit_disch_date + inscomp1.inscomp1_lagdays * INTERVAL '1 day') 
    AND vSub.visit_servicecd_key ~ '^[0-9]+$|[FHJbKkN\\+]' = false 
    AND LEFT (vSub.visit_id, 1) NOT IN ('F', 'P', 'J') 
    AND NULLIF (i.is1_bill_dt, '0001-01-01' ::date) IS NULL 
    AND vSub.visit_disch_date <= da.da152_chgdt
) 

Select vCriteria.* from vCriteria

It executes in 609 ms

I add 1 join and it takes a lot longer: LEFT JOIN visit_diagnosis vd ON vd.vdiag_visit_key = vSub.visit_id

WITH vCriteria AS (
    SELECT vSub.visit_id, vd.vdiag_diag from visit vSUB  
    JOIN insurance_company_table1 inscomp1 ON inscomp1.inscomp1_comp = vSub.visit_ins 
        and inscomp1.inscomp1_arid = vSub.visit_arid
    JOIN da152 da ON da.da152_arid = vSub.visit_arid 
    LEFT JOIN insurance1 i ON i.is1_num = vSub.visit_id AND i.is1_ins_full = vSub.visit_ins
   LEFT JOIN visit_diagnosis vd ON vd.vdiag_visit_key = vSub.visit_id

    WHERE vSub.visit_mr_fdt 
    BETWEEN (now() - 50 * INTERVAL '1 day') AND (now() + 10 * INTERVAL '1 day') 
    AND NULLIF (vSub.visit_disch_date, '0001-01-01' ::date) IS NOT NULL 
    AND Now() >=(vSub.visit_disch_date + inscomp1.inscomp1_lagdays * INTERVAL '1 day') 
    AND vSub.visit_servicecd_key ~ '^[0-9]+$|[FHJbKkN\\+]' = false 
    AND LEFT (vSub.visit_id, 1) NOT IN ('F', 'P', 'J') 
    AND NULLIF (i.is1_bill_dt, '0001-01-01' ::date) IS NULL 
    AND vSub.visit_disch_date <= da.da152_chgdt
) 

Select vCriteria.* from vCriteria

It takes over 13 seconds to execute

So with the above query very simplified here are examples I tried. If I do something like this:

SELECT t.id_field from table t
JOIN . . .
JOIN . . .    
WHERE <10 rows of filtering criteria>

it executes in 766 ms and brings back 411 records

When I run this query alone:

select t.id_field, t2.field1 from table t
LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field
WHERE t.id_field in ( <I list out the varchar(8) values here from first query> )

It executes in 3.031 sec

But if I try to put the queries together, like this

select t.id_field, t2.field1 from table t
LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field
WHERE t.id_field in ( 
    SELECT t.id_field from table t
    JOIN . . .
    JOIN . . .    
    WHERE <10 rows of filtering criteria>
 )

It takes 19.406 sec

I have tried a WITH statement like so:

WITH vt AS (
    SELECT t.id_field from table t
    JOIN . . .
    JOIN . . .    
    WHERE <10 rows of filtering criteria>
)
select vt.id_field, t2.field1 from vt
LEFT JOIN table2 t2 ON t2.t_id_field = vt.id_field

It executes in 12 seconds

I have tried this:

select t.id_field, t2.field1 from ( 
        SELECT tsub.id_field from table tsub
        JOIN . . .
        JOIN . . .    
        WHERE <10 rows of filtering criteria>
     ) t
LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field

Executes in 12.750 sec

Another try was just this:

SELECT t.id_field, t2.field1 from table t
JOIN . . .
JOIN . . .
LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field  
WHERE <10 rows of filtering criteria>

Executed in 12.813 sec

Any thoughts on how to get it to at least be 3 seconds like the one above where I just put the values manually in?

This is the Execution Plan

QUERY PLAN
Nested Loop Left Join  (cost=109243.37..119017.89 rows=1 width=83)
  CTE vc
    ->  Nested Loop  (cost=14871.86..109243.37 rows=1 width=7)
          ->  Nested Loop  (cost=14871.86..109242.01 rows=4 width=24)
                Join Filter: ((vsub.visit_disch_date <= da.da152_chgdt) AND (vsub.visit_arid = da.da152_arid))
                ->  Index Scan using da152_pkey on da152 da  (cost=0.00..10.40 rows=4 width=9)
                ->  Materialize  (cost=14871.86..109230.80 rows=12 width=19)
                      ->  Hash Right Join  (cost=14871.86..109230.74 rows=12 width=19)
                            Hash Cond: (((i.is1_num)::text = (vsub.visit_id)::text) AND ((i.is1_ins_full)::text = (vsub.visit_ins)::text))
                            Filter: (NULLIF(i.is1_bill_dt, '0001-01-01'::date) IS NULL)
                            ->  Seq Scan on insurance1 i  (cost=0.00..90993.64 rows=336264 width=14)
                            ->  Hash  (cost=14837.16..14837.16 rows=2313 width=19)
                                  ->  Seq Scan on visit vsub  (cost=0.00..14837.16 rows=2313 width=19)
                                        Filter: ((NULLIF(visit_disch_date, '0001-01-01'::date) IS NOT NULL) AND ((visit_servicecd_key)::text !~ '^[0-9]+$|[FHJbKkN\\+]'::text) AND ("left"((visit_id)::text, 1) <> ALL ('{F,P,J}'::text[])) AND (visit_mr_fdt >= (now() - '50 days'::interval)) AND (visit_mr_fdt <= (now() + '10 days'::interval)))
          ->  Index Scan using inscomp1_pkey on insurance_company_table1 inscomp1  (cost=0.00..0.33 rows=1 width=11)
                Index Cond: ((inscomp1_arid = vsub.visit_arid) AND ((inscomp1_comp)::text = (vsub.visit_ins)::text))
                Filter: (now() >= (vsub.visit_disch_date + ((inscomp1_lagdays)::double precision * '1 day'::interval)))
  ->  CTE Scan on vc  (cost=0.00..0.02 rows=1 width=12)
  ->  Index Scan using vdiag_pkey on visit_diagnosis vd  (cost=0.00..9774.48 rows=1 width=71)
        Index Cond: ((vdiag_visit_key)::text = (vc.visit_id)::text)
        Filter: ((vdiag_type)::text = 'A'::text)

And the Explain (Analyze, Verbose) (FOR "WITH" QUERY EXAMPLE ABOVE)

QUERY PLAN
Nested Loop Left Join  (cost=109243.36..119017.87 rows=1 width=83) (actual time=418.062..13796.260 rows=414 loops=1)
  Output: vc.visit_id, vd.vdiag_visit_key, vd.vdiag_arxseq, vd.vdiag_element, vd.vdiag_type, vd.vdiag_date, vd.vdiag_diag, vd.vdiag_arid
  CTE vc
    ->  Nested Loop  (cost=14871.84..109243.36 rows=1 width=7) (actual time=404.477..1006.518 rows=414 loops=1)
          Output: vsub.visit_id
          ->  Nested Loop  (cost=14871.84..109242.00 rows=4 width=24) (actual time=404.432..990.137 rows=418 loops=1)
                Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date, da.da152_arid
                Join Filter: ((vsub.visit_disch_date <= da.da152_chgdt) AND (vsub.visit_arid = da.da152_arid))
                Rows Removed by Join Filter: 1374
                ->  Index Scan using da152_pkey on public.da152 da  (cost=0.00..10.40 rows=4 width=9) (actual time=0.004..0.017 rows=4 loops=1)
                      Output: da.da152_rel_file_key_num, da.da152_contrl, da.da152_rcdt, da.da152_chgdt, da.da152_ar_rev, da.da152_ar_rc, da.da152_ar_rcar, da.da152_unpost, da.da152_lrun, da.da152_ar_month, da.da152_insdt, da.da152_otc_rcdt, da.da152_sum_rev, da.da152_sum_rc, da.da152_sum_rcar, da.da152_begbal, da.da152_netar, da.da152_arend, da.da152_eomtrial, da.da152_rc_da, da.da152_rc_amt_otc, da.da152_rc_amt_cp, da.da152_chg_thru, da.da152_ar_ldom, da.da152_csnum, da.da152_sec_e_bill, da.da152_uform, da.da152_jform, da.da152_rate, da.da152_contract_bill, da.da152_up_front, da.da152_bad_debt, da.da152_pat_refund, da.da152_2cnd_ins, da.da152_pat_type1, da.da152_pat_type2, da.da152_pat_srvcd1_old, da.da152_pat_srvcd2_old, da.da152_wo_colcd1, da.da152_wo_alpha1s, da.da152_wo_alpha1e, da.da152_wo_colcd2, da.da152_wo_alpha2s, da.da152_wo_alpha2e, da.da152_wo_colcd3, da.da152_wo_alpha3s, da.da152_wo_alpha3e, da.da152_ins_prim, da.da152_tic_xfc1, da.da152_tic_xfc2, da.da152_ins_fc, da.da152_wo_sitnum, da.da152_wo_sit_cd, da.da152_ins2dt, da.da152_multicoqual, da.da152_multicoid, da.da152_multicodir, da.da152_multicoglnum, da.da152_pat_srvcd1, da.da152_pat_srvcd2, da.da152_adchg, da.da152_adchg_var, da.da152_month_closedt, da.da152_pat_colcd, da.da152_nonaracct, da.da152_isfup_days, da.da152_arid
                ->  Materialize  (cost=14871.84..109230.79 rows=12 width=19) (actual time=101.103..246.563 rows=448 loops=4)
                      Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                      ->  Hash Right Join  (cost=14871.84..109230.73 rows=12 width=19) (actual time=404.408..983.328 rows=448 loops=1)
                            Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                            Hash Cond: (((i.is1_num)::text = (vsub.visit_id)::text) AND ((i.is1_ins_full)::text = (vsub.visit_ins)::text))
                            Filter: (NULLIF(i.is1_bill_dt, '0001-01-01'::date) IS NULL)
                            Rows Removed by Filter: 2018
                            ->  Seq Scan on public.insurance1 i  (cost=0.00..90993.64 rows=336264 width=14) (actual time=0.002..284.366 rows=341073 loops=1)
                                  Output: i.is1_num, i.is1_ins_full, i.is1_set, i.is1_prim, i.is1_stay, i.is1_direct, i.is1_admit, i.is1_disc, i.is1_from, i.is1_to, i.is1_app_sent, i.is1_app_rec, i.is1_gen_dt, i.is1_ck_dt, i.is1_contract_old, i.is1_bill_dt, i.is1_pd_dt1, i.is1_pd_amt1, i.is1_pd_dt2, i.is1_pd_amt2, i.is1_pd_dt3, i.is1_pd_amt3, i.is1_pd_status, i.is1_med_ness, i.is1_bl_furn, i.is1_bl_rep, i.is1_bl_n_rep, i.is1_bl_rate, i.is1_bl_chg, i.is1_bl_non, i.is1_coamt, i.is1_expay, i.is1_semi_rate, i.is1_diem, i.is1_cd, i.is1_typrm, i.is1_qty, i.is1_chg, i.is1_non, i.is1_opcd, i.is1_opdate, i.is1_opdesc, i.is1_opchg, i.is1_opphy, i.is1_totchg, i.is1_totnon, i.is1_maxlines, i.is1_totlines, i.is1_apc_sw, i.is1_eapc_reim, i.is1_eapc_copay, i.is1_eapc_contr, i.is1_eapc_ded, i.is1_eapc_outlr, i.is1_aapc_reim, i.is1_aapc_copay, i.is1_aapc_contr, i.is1_aapc_ded, i.is1_aapc_outlr, i.is1_mr_okayed, i.is1_mr_ok_init, i.is1_a_ded, i.is1_dayfull, i.is1_dayco, i.is1_daylife, i.is1_bl_ded, i.is1_bl_d_amt, i.is1_contr2, i.is1_copay, i.is1_colim, i.is1_covrmrate, i.is1_lifeused, i.is1_covdays, i.is1_nondays, i.is1_co_used, i.is1_typary, i.is1_effdt, i.is1_covrate, i.is1_endcare, i.is1_drg, i.is1_p_name, i.is1_birth, i.is1_print, i.is1_mndays, i.is1_reim_amt, i.is1_elecbill, i.is1_elecdate, i.is1_ppscode, i.is1_psrocode, i.is1_typebill, i.is1_new_op_medicare, i.is1_netreim, i.is1_serv, i.is1_printcross, i.is1_crossdate, i.is1_filetype, i.is1_releaseinfo, i.is1_pdtype1, i.is1_pdtype2, i.is1_pdtype3, i.is1_coins, i.is1_otprim, i.is1_otins, i.is1_otset, i.is1_coveragesetby, i.is1_covcalctype, i.is1_covverified, i.is1_netcalc, i.is1_orig_expay, i.is1_orig_bl_d_amt, i.is1_orig_coamt, i.is1_orig_a_ded, i.is1_orig_drg, i.is1_orig_covdays, i.is1_excrep, i.is1_exccov, i.is1_genviaautocl, i.is1_autobill, i.is1_formver, i.is1_contract, i.is1_origin, i.is1_covgoverride, i.is1_covglogname, i.is1_covgcsnum, i.is1_excchgs, i.is1_arid, i.is1_drg_icd10
                            ->  Hash  (cost=14837.16..14837.16 rows=2312 width=19) (actual time=394.738..394.738 rows=2416 loops=1)
                                  Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                                  Buckets: 1024  Batches: 1  Memory Usage: 123kB
                                  ->  Seq Scan on public.visit vsub  (cost=0.00..14837.16 rows=2312 width=19) (actual time=0.097..391.839 rows=2416 loops=1)
                                        Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                                        Filter: ((NULLIF(vsub.visit_disch_date, '0001-01-01'::date) IS NOT NULL) AND ((vsub.visit_servicecd_key)::text !~ '^[0-9]+$|[FHJbKkN\\+]'::text) AND ("left"((vsub.visit_id)::text, 1) <> ALL ('{F,P,J}'::text[])) AND (vsub.visit_mr_fdt >= (now() - '50 days'::interval)) AND (vsub.visit_mr_fdt <= (now() + '10 days'::interval)))
                                        Rows Removed by Filter: 201579
          ->  Index Scan using inscomp1_pkey on public.insurance_company_table1 inscomp1  (cost=0.00..0.33 rows=1 width=11) (actual time=0.029..0.031 rows=1 loops=418)
                Output: inscomp1.inscomp1_comp, inscomp1.inscomp1_name, inscomp1.inscomp1_addr1, inscomp1.inscomp1_addr2, inscomp1.inscomp1_city, inscomp1.inscomp1_state, inscomp1.inscomp1_zip, inscomp1.inscomp1_form, inscomp1.inscomp1_jour, inscomp1.inscomp1_prov, inscomp1.inscomp1_typebill, inscomp1.inscomp1_perdiem, inscomp1.inscomp1_diemdt, inscomp1.inscomp1_priordiem, inscomp1.inscomp1_pcused, inscomp1.inscomp1_phybillsw, inscomp1.inscomp1_approval, inscomp1.inscomp1_psro, inscomp1.inscomp1_ask_reim, inscomp1.inscomp1_reject_days, inscomp1.inscomp1_contrgl, inscomp1.inscomp1_othargl, inscomp1.inscomp1_auto_ip, inscomp1.inscomp1_auto_op, inscomp1.inscomp1_prirate, inscomp1.inscomp1_secrate, inscomp1.inscomp1_lagdays, inscomp1.inscomp1_mul1500, inscomp1.inscomp1_waitfin, inscomp1.inscomp1_opsum, inscomp1.inscomp1_write_off, inscomp1.inscomp1_transmit, inscomp1.inscomp1_detail_bl, inscomp1.inscomp1_provname, inscomp1.inscomp1_provaddr1, inscomp1.inscomp1_provaddr2, inscomp1.inscomp1_provcity, inscomp1.inscomp1_provst, inscomp1.inscomp1_provzip, inscomp1.inscomp1_cont_perc, inscomp1.inscomp1_drg_str, inscomp1.inscomp1_drg_acps, inscomp1.inscomp1_drg_dt, inscomp1.inscomp1_ub_loc2, inscomp1.inscomp1_sub_id, inscomp1.inscomp1_bcno, inscomp1.inscomp1_taxno, inscomp1.inscomp1_mcareno, inscomp1.inscomp1_mcaidno, inscomp1.inscomp1_signature, inscomp1.inscomp1_prov2, inscomp1.inscomp1_allpay, inscomp1.inscomp1_discrate, inscomp1.inscomp1_discdt, inscomp1.inscomp1_discrate2, inscomp1.inscomp1_sumitem, inscomp1.inscomp1_bank_plan, inscomp1.inscomp1_phy_1500, inscomp1.inscomp1_labmc, inscomp1.inscomp1_orer_comb, inscomp1.inscomp1_phybillex, inscomp1.inscomp1_net_reim, inscomp1.inscomp1_ask_drg, inscomp1.inscomp1_provphone, inscomp1.inscomp1_drggroup, inscomp1.inscomp1_phyins, inscomp1.inscomp1_phyub82, inscomp1.inscomp1_mnxsnf, inscomp1.inscomp1_srcpmtcde, inscomp1.inscomp1_provname2, inscomp1.inscomp1_coll_id, inscomp1.inscomp1_diagptr, inscomp1.inscomp1_tax, inscomp1.inscomp1_type_desc, inscomp1.inscomp1_revcode, inscomp1.inscomp1_arid
                Index Cond: ((inscomp1.inscomp1_arid = vsub.visit_arid) AND ((inscomp1.inscomp1_comp)::text = (vsub.visit_ins)::text))
                Filter: (now() >= (vsub.visit_disch_date + ((inscomp1.inscomp1_lagdays)::double precision * '1 day'::interval)))
  ->  CTE Scan on vc  (cost=0.00..0.02 rows=1 width=12) (actual time=404.481..1008.180 rows=414 loops=1)
        Output: vc.visit_id
  ->  Index Scan using vdiag_pkey on public.visit_diagnosis vd  (cost=0.00..9774.48 rows=1 width=71) (actual time=14.246..30.882 rows=1 loops=414)
        Output: vd.vdiag_visit_key, vd.vdiag_arxseq, vd.vdiag_element, vd.vdiag_type, vd.vdiag_date, vd.vdiag_diag, vd.vdiag_arid
        Index Cond: ((vd.vdiag_visit_key)::text = (vc.visit_id)::text)
        Filter: ((vd.vdiag_type)::text = 'A'::text)
        Rows Removed by Filter: 1
Total runtime: 13796.881 ms

EXPLAIN (ANALYZE, VERBOSE) FOR JUST GETTING THE ID'S THAT TOOK MILLISECONDS

QUERY PLAN
Nested Loop  (cost=14871.84..109244.42 rows=1 width=7) (actual time=403.659..985.167 rows=432 loops=1)
  Output: vsub.visit_id
  ->  Nested Loop  (cost=14871.84..109243.06 rows=4 width=24) (actual time=403.618..980.483 rows=441 loops=1)
        Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date, da.da152_arid
        Join Filter: ((vsub.visit_disch_date <= da.da152_chgdt) AND (vsub.visit_arid = da.da152_arid))
        Rows Removed by Join Filter: 1351
        ->  Index Scan using da152_pkey on public.da152 da  (cost=0.00..10.40 rows=4 width=9) (actual time=0.004..0.014 rows=4 loops=1)
              Output: da.da152_rel_file_key_num, da.da152_contrl, da.da152_rcdt, da.da152_chgdt, da.da152_ar_rev, da.da152_ar_rc, da.da152_ar_rcar, da.da152_unpost, da.da152_lrun, da.da152_ar_month, da.da152_insdt, da.da152_otc_rcdt, da.da152_sum_rev, da.da152_sum_rc, da.da152_sum_rcar, da.da152_begbal, da.da152_netar, da.da152_arend, da.da152_eomtrial, da.da152_rc_da, da.da152_rc_amt_otc, da.da152_rc_amt_cp, da.da152_chg_thru, da.da152_ar_ldom, da.da152_csnum, da.da152_sec_e_bill, da.da152_uform, da.da152_jform, da.da152_rate, da.da152_contract_bill, da.da152_up_front, da.da152_bad_debt, da.da152_pat_refund, da.da152_2cnd_ins, da.da152_pat_type1, da.da152_pat_type2, da.da152_pat_srvcd1_old, da.da152_pat_srvcd2_old, da.da152_wo_colcd1, da.da152_wo_alpha1s, da.da152_wo_alpha1e, da.da152_wo_colcd2, da.da152_wo_alpha2s, da.da152_wo_alpha2e, da.da152_wo_colcd3, da.da152_wo_alpha3s, da.da152_wo_alpha3e, da.da152_ins_prim, da.da152_tic_xfc1, da.da152_tic_xfc2, da.da152_ins_fc, da.da152_wo_sitnum, da.da152_wo_sit_cd, da.da152_ins2dt, da.da152_multicoqual, da.da152_multicoid, da.da152_multicodir, da.da152_multicoglnum, da.da152_pat_srvcd1, da.da152_pat_srvcd2, da.da152_adchg, da.da152_adchg_var, da.da152_month_closedt, da.da152_pat_colcd, da.da152_nonaracct, da.da152_isfup_days, da.da152_arid
        ->  Materialize  (cost=14871.84..109231.85 rows=12 width=19) (actual time=100.901..244.629 rows=448 loops=4)
              Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
              ->  Hash Right Join  (cost=14871.84..109231.79 rows=12 width=19) (actual time=403.596..976.869 rows=448 loops=1)
                    Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                    Hash Cond: (((i.is1_num)::text = (vsub.visit_id)::text) AND ((i.is1_ins_full)::text = (vsub.visit_ins)::text))
                    Filter: (NULLIF(i.is1_bill_dt, '0001-01-01'::date) IS NULL)
                    Rows Removed by Filter: 2018
                    ->  Seq Scan on public.insurance1 i  (cost=0.00..90994.67 rows=336267 width=14) (actual time=0.002..283.076 rows=341075 loops=1)
                          Output: i.is1_num, i.is1_ins_full, i.is1_set, i.is1_prim, i.is1_stay, i.is1_direct, i.is1_admit, i.is1_disc, i.is1_from, i.is1_to, i.is1_app_sent, i.is1_app_rec, i.is1_gen_dt, i.is1_ck_dt, i.is1_contract_old, i.is1_bill_dt, i.is1_pd_dt1, i.is1_pd_amt1, i.is1_pd_dt2, i.is1_pd_amt2, i.is1_pd_dt3, i.is1_pd_amt3, i.is1_pd_status, i.is1_med_ness, i.is1_bl_furn, i.is1_bl_rep, i.is1_bl_n_rep, i.is1_bl_rate, i.is1_bl_chg, i.is1_bl_non, i.is1_coamt, i.is1_expay, i.is1_semi_rate, i.is1_diem, i.is1_cd, i.is1_typrm, i.is1_qty, i.is1_chg, i.is1_non, i.is1_opcd, i.is1_opdate, i.is1_opdesc, i.is1_opchg, i.is1_opphy, i.is1_totchg, i.is1_totnon, i.is1_maxlines, i.is1_totlines, i.is1_apc_sw, i.is1_eapc_reim, i.is1_eapc_copay, i.is1_eapc_contr, i.is1_eapc_ded, i.is1_eapc_outlr, i.is1_aapc_reim, i.is1_aapc_copay, i.is1_aapc_contr, i.is1_aapc_ded, i.is1_aapc_outlr, i.is1_mr_okayed, i.is1_mr_ok_init, i.is1_a_ded, i.is1_dayfull, i.is1_dayco, i.is1_daylife, i.is1_bl_ded, i.is1_bl_d_amt, i.is1_contr2, i.is1_copay, i.is1_colim, i.is1_covrmrate, i.is1_lifeused, i.is1_covdays, i.is1_nondays, i.is1_co_used, i.is1_typary, i.is1_effdt, i.is1_covrate, i.is1_endcare, i.is1_drg, i.is1_p_name, i.is1_birth, i.is1_print, i.is1_mndays, i.is1_reim_amt, i.is1_elecbill, i.is1_elecdate, i.is1_ppscode, i.is1_psrocode, i.is1_typebill, i.is1_new_op_medicare, i.is1_netreim, i.is1_serv, i.is1_printcross, i.is1_crossdate, i.is1_filetype, i.is1_releaseinfo, i.is1_pdtype1, i.is1_pdtype2, i.is1_pdtype3, i.is1_coins, i.is1_otprim, i.is1_otins, i.is1_otset, i.is1_coveragesetby, i.is1_covcalctype, i.is1_covverified, i.is1_netcalc, i.is1_orig_expay, i.is1_orig_bl_d_amt, i.is1_orig_coamt, i.is1_orig_a_ded, i.is1_orig_drg, i.is1_orig_covdays, i.is1_excrep, i.is1_exccov, i.is1_genviaautocl, i.is1_autobill, i.is1_formver, i.is1_contract, i.is1_origin, i.is1_covgoverride, i.is1_covglogname, i.is1_covgcsnum, i.is1_excchgs, i.is1_arid, i.is1_drg_icd10
                    ->  Hash  (cost=14837.16..14837.16 rows=2312 width=19) (actual time=393.928..393.928 rows=2416 loops=1)
                          Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                          Buckets: 1024  Batches: 1  Memory Usage: 123kB
                          ->  Seq Scan on public.visit vsub  (cost=0.00..14837.16 rows=2312 width=19) (actual time=0.088..391.163 rows=2416 loops=1)
                                Output: vsub.visit_id, vsub.visit_ins, vsub.visit_arid, vsub.visit_disch_date
                                Filter: ((NULLIF(vsub.visit_disch_date, '0001-01-01'::date) IS NOT NULL) AND ((vsub.visit_servicecd_key)::text !~ '^[0-9]+$|[FHJbKkN\\+]'::text) AND ("left"((vsub.visit_id)::text, 1) <> ALL ('{F,P,J}'::text[])) AND (vsub.visit_mr_fdt >= (now() - '50 days'::interval)) AND (vsub.visit_mr_fdt <= (now() + '10 days'::interval)))
                                Rows Removed by Filter: 201581
  ->  Index Scan using inscomp1_pkey on public.insurance_company_table1 inscomp1  (cost=0.00..0.33 rows=1 width=11) (actual time=0.007..0.008 rows=1 loops=441)
        Output: inscomp1.inscomp1_comp, inscomp1.inscomp1_name, inscomp1.inscomp1_addr1, inscomp1.inscomp1_addr2, inscomp1.inscomp1_city, inscomp1.inscomp1_state, inscomp1.inscomp1_zip, inscomp1.inscomp1_form, inscomp1.inscomp1_jour, inscomp1.inscomp1_prov, inscomp1.inscomp1_typebill, inscomp1.inscomp1_perdiem, inscomp1.inscomp1_diemdt, inscomp1.inscomp1_priordiem, inscomp1.inscomp1_pcused, inscomp1.inscomp1_phybillsw, inscomp1.inscomp1_approval, inscomp1.inscomp1_psro, inscomp1.inscomp1_ask_reim, inscomp1.inscomp1_reject_days, inscomp1.inscomp1_contrgl, inscomp1.inscomp1_othargl, inscomp1.inscomp1_auto_ip, inscomp1.inscomp1_auto_op, inscomp1.inscomp1_prirate, inscomp1.inscomp1_secrate, inscomp1.inscomp1_lagdays, inscomp1.inscomp1_mul1500, inscomp1.inscomp1_waitfin, inscomp1.inscomp1_opsum, inscomp1.inscomp1_write_off, inscomp1.inscomp1_transmit, inscomp1.inscomp1_detail_bl, inscomp1.inscomp1_provname, inscomp1.inscomp1_provaddr1, inscomp1.inscomp1_provaddr2, inscomp1.inscomp1_provcity, inscomp1.inscomp1_provst, inscomp1.inscomp1_provzip, inscomp1.inscomp1_cont_perc, inscomp1.inscomp1_drg_str, inscomp1.inscomp1_drg_acps, inscomp1.inscomp1_drg_dt, inscomp1.inscomp1_ub_loc2, inscomp1.inscomp1_sub_id, inscomp1.inscomp1_bcno, inscomp1.inscomp1_taxno, inscomp1.inscomp1_mcareno, inscomp1.inscomp1_mcaidno, inscomp1.inscomp1_signature, inscomp1.inscomp1_prov2, inscomp1.inscomp1_allpay, inscomp1.inscomp1_discrate, inscomp1.inscomp1_discdt, inscomp1.inscomp1_discrate2, inscomp1.inscomp1_sumitem, inscomp1.inscomp1_bank_plan, inscomp1.inscomp1_phy_1500, inscomp1.inscomp1_labmc, inscomp1.inscomp1_orer_comb, inscomp1.inscomp1_phybillex, inscomp1.inscomp1_net_reim, inscomp1.inscomp1_ask_drg, inscomp1.inscomp1_provphone, inscomp1.inscomp1_drggroup, inscomp1.inscomp1_phyins, inscomp1.inscomp1_phyub82, inscomp1.inscomp1_mnxsnf, inscomp1.inscomp1_srcpmtcde, inscomp1.inscomp1_provname2, inscomp1.inscomp1_coll_id, inscomp1.inscomp1_diagptr, inscomp1.inscomp1_tax, inscomp1.inscomp1_type_desc, inscomp1.inscomp1_revcode, inscomp1.inscomp1_arid
        Index Cond: ((inscomp1.inscomp1_arid = vsub.visit_arid) AND ((inscomp1.inscomp1_comp)::text = (vsub.visit_ins)::text))
        Filter: (now() >= (vsub.visit_disch_date + ((inscomp1.inscomp1_lagdays)::double precision * '1 day'::interval)))
        Rows Removed by Filter: 0
Total runtime: 985.603 ms
9
  • have you tried running EXPLAIN SELECT ... to see the query plan? I would try using an 'ANY' to see what happens. like SELECT t.id_field, t2.field1 FROM table t LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field WHERE t.id_field = ANY( SELECT t.id_field from table t JOIN . . . JOIN . . . WHERE <10 rows of filtering criteria> ) Commented Nov 18, 2016 at 23:11
  • 1
    Please edit your question and add the execution plans generated by explain (analyze, verbose). Formatted text please, no screen shots Commented Nov 18, 2016 at 23:17
  • alternatively if your table t is really big, you can create a temporary version of it, index it, then use that instead Commented Nov 18, 2016 at 23:18
  • I tried the ANY and I received an "out of memory for query result response" after 1 minute. Commented Nov 18, 2016 at 23:25
  • 1
    And what about the plan for the query that only takes 766ms? Commented Nov 18, 2016 at 23:39

2 Answers 2

1

A complete rewrite of your query:

  • only retaining the visit and diagnosis tables in the outer query
  • moving the rest of the tables (which are only used as conditions) into EXISTS() phrases
  • rewriting the NULLIF(a,b) IS NULL to a > b, since b appears to be a N/A sentinel value (I would suggest -infinity for this, or just NULL)
  • AND changed one LEFT JOIN ... NULL to a NOT EXISTS(...) construct.
  • untested since I dont have the table definitions

WITH vCriteria AS (
    SELECT vs.visit_id, vd.vdiag_diag  
    FROM visit vs
    LEFT JOIN visit_diagnosis vd ON vd.vdiag_visit_key = vs.visit_id
    WHERE EXISTS ( SELECT *  
        FROM da152 da 
        WHERE da.da152_arid = vs.visit_arid 
        AND vs.visit_disch_date <= da.da152_chgdt
        )
    AND EXISTS ( SELECT *
        FROM insurance_company_table1 ict  
        WHERE ict.inscomp1_comp = vs.visit_ins
        AND ict.inscomp1_arid = vs.visit_arid
        AND Now() >= (vs.visit_disch_date + ict.inscomp1_lagdays * INTERVAL '1 day')
        )
    AND NOT EXISTS (
        SELECT * 
        FROM insurance1 ins 
        WHERE ins.is1_num = vs.visit_id AND ins.is1_ins_full = vs.visit_ins
        AND ins.is1_bill_dt > '0001-01-01'::date
        )
    AND vs.visit_mr_fdt
      BETWEEN (now() - 50 * INTERVAL '1 day') AND (now() + 10 * INTERVAL '1 day')
    AND vs.visit_disch_date > '0001-01-01'::date
    AND vs.visit_servicecd_key ~ '^[0-9]+$|[FHJbKkN\\+]' = false
    AND LEFT (vs.visit_id, 1) NOT IN ('F', 'P', 'J') 
) 
SELECT vCriteria.* FROM vCriteria
        ;
Sign up to request clarification or add additional context in comments.

Comments

0

This doesn't answer why, but I hope it helps lead to some insight. I've found that an in list:

select a from t1 where x in (select y from t2)

is fine when the in-list is low cardinality, but as the size of the list grows, it becomes less efficient. By contrast, a sem-join:

select a
from t1
where exists (
  select null
  from t2.x = t2.y
  where t1.
)

is almost always efficient, for big or small lists.

Out of laziness, I've used the in-list on my first hack and have nearly consistently regretted it. When I switch to a semi-join, the world is as it should be.

So, can you try to change your query to this and see what happens?

select t.id_field, t2.field1 from table t
LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field
WHERE exists (
  select 1
  from table tt
  join . . .
  join . . .
  where
    t.id_field = tt.id_field and
     <10 rows of filtering criteria>
 )

And to be clear, the difference between a semi-join and a join is that if the in-list contains repeats, a semi-join ignores subsequent occurrences. Not only does it not multiply row results, but it "stops looking" for each match.

But as to your primary question -- can you get this down to the original three seconds? I doubt it, because I suspect that inner query is where the additional nine seconds came from.

-- EDIT --

This may be more code than you're willing to deal with to save nine seconds, but what happens when you run this?

CREATE OR REPLACE FUNCTION test()
  RETURNS TABLE (
    id_field varchar,
    field1   varchar
  ) AS
$BODY$
  DECLARE
    id_field_list varchar[];
  BEGIN
    SELECT array_agg (t.id_field)
    into id_field_list
    from table t
    JOIN . . .
    JOIN . . .    
    WHERE <10 rows of filtering criteria>;

    return query
    select t.id_field, t2.field1 from table t
    LEFT JOIN table2 t2 ON t2.t_id_field = t.id_field
    WHERE t.id_field = any(id_field_list);
  END
$BODY$
LANGUAGE plpgsql;

and of course, you would get the query results from:

select * from test();

3 Comments

The Exists way is taking over 20 seconds. You mentioned the inner query is taking the additional nine seconds, but it runs in less than a second on it's own, so why is it taking so much longer? Logically the filtering criteria takes less than a second and querying table2 with the ID's manually typed takes 3 seconds, so 4 seconds would be reasonable to think we could query the data at least.
That is curious... and unexpected. I added a function to my answer, just as a for-instance, to see if it gets the results you expect. It should essentially be doing that you did by hand, only in one step. Even if it gives the desired results, it doesn't explain your fundamental question, and again, it may be more than you're willing to deal with to shave off a few seconds.
Thank you for the function, but I only have read-only access which is making this more difficult. I did update my original post with a dummied down query that shows the speed problem so you can see how I am really writing my query and make sure I am not doing something strange.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.