I just started working with PostgreSQL after some experience with Oracle, and I have a query that returns in 200 ms in Oracle but takes 1.40 minutes in Postgres. The culprit seems to be:

AND product_cost_view.product_type_id = product.product_type_id

When I remove this condition, or hardcode product_cost_view.product_type_id to a specific ID, the query runs fast. The explain plan didn't seem to give any insight; it just says INDEX SCAN ON TABLE product TOTAL COST 776403 1913 ROWS.

Yes, product_cost_view is a view. I've also noticed that if I replace the view with a table that also has product_type_id, the query is fast as well. I tried CTEs and subselects in 100 different forms, but whenever I use product.product_type_id in the WHERE clause against that view, it is hellishly slow, and I can't see what I'm missing. Thanks in advance :) P.S. Yes, I have exactly the same data and indexes in both databases.

SELECT COUNT(*)
FROM product
WHERE user_id = 1000000
  AND (product_id IN (SELECT DISTINCT product_id
                        FROM product_cost_view
                        WHERE user_id = 1000000
                          AND cost_type = 'X'
                          AND product_cost_view.product_type_id = product.product_type_id)
    );
  • Note that the DISTINCT in the subquery is completely useless. Oracle might optimize it away, but I don't think Postgres will. Commented Nov 11, 2020 at 16:58
  • Please edit your question and add the execution plan generated using explain (analyze, buffers, format text) (not just a "simple" explain) as formatted text, and make sure you preserve the indentation of the plan. Paste the text, then put ``` on the line before the plan and on the line after the plan. Please also include the complete create index statements for all indexes. Commented Nov 11, 2020 at 16:58
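
As a sketch of what that request looks like in practice, the query from the question can be wrapped as shown here. Note that ANALYZE actually executes the statement, so this should be expected to take about as long as the slow query itself.

```sql
-- ANALYZE runs the query and reports actual row counts and timings per plan node;
-- BUFFERS adds buffer hit/read statistics; FORMAT TEXT is the default output format.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT COUNT(*)
FROM product
WHERE user_id = 1000000
  AND (product_id IN (SELECT DISTINCT product_id
                        FROM product_cost_view
                        WHERE user_id = 1000000
                          AND cost_type = 'X'
                          AND product_cost_view.product_type_id = product.product_type_id)
    );
```

In the output, a SubPlan node with a large "loops" value would suggest that the subquery is being re-executed once per outer row.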

2 Answers

Because of the DISTINCT, PostgreSQL cannot flatten the subquery into a join, so you are running the subquery for every row found in product.

Hard to say for certain without seeing the execution plan, but this should be faster:

SELECT COUNT(*)
FROM product AS p
WHERE p.user_id = 1000000
  AND EXISTS (SELECT 1 FROM product_cost_view AS pc
              WHERE pc.product_type_id = p.product_type_id
                AND pc.product_id = p.product_id
                AND pc.user_id = 1000000
                AND pc.cost_type = 'X');
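
Another form that might be worth trying, offered only as an untested sketch and not as part of the answer above: keep IN, but remove the reference to the outer table by comparing both columns as a row value. The idea is that a subquery which does not refer to the outer query is easier for the planner to turn into a semi-join. Whether it helps here depends on the view definition, which the question does not show.

```sql
-- Assumption: same tables and columns as in the question; no new objects are needed.
-- The subquery no longer references product, so it is uncorrelated.
SELECT COUNT(*)
FROM product AS p
WHERE p.user_id = 1000000
  AND (p.product_id, p.product_type_id) IN (
        SELECT pc.product_id, pc.product_type_id
        FROM product_cost_view AS pc
        WHERE pc.user_id = 1000000
          AND pc.cost_type = 'X');
```

Rows where product_id or product_type_id is NULL never match in either form, so this should return the same count as the EXISTS version.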

Could you try this variant:

SELECT COUNT(DISTINCT P.product_id)
FROM product P
INNER JOIN product_cost_view PC
    ON P.product_id = PC.product_id
    AND P.user_id = PC.user_id
    AND P.product_type_id = PC.product_type_id
WHERE P.user_id = 1000000
    AND PC.cost_type = 'X'

2 Comments

Thanks, it works fast now! Can you give any insight on why it was slow in the first form?
Well, it's hard to tell without running the code in your environment, but in my experience there are many cases where the SQL engine (SQL Server, PostgreSQL, Oracle) does not build a good plan if the query is too complex. So I just try to simplify it.
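
One caveat when switching between these rewrites: the join variant counts distinct product_id values, while the original query counts rows of product. The two agree only if product_id is unique among the matching product rows, which the question does not state. A quick sanity check, sketched with the names from the question, is to compute both counts side by side:

```sql
-- Assumption: tables and columns exactly as in the question.
-- If the two columns differ, product_id is not unique among the matching rows.
SELECT
  (SELECT COUNT(*)
   FROM product AS p
   WHERE p.user_id = 1000000
     AND EXISTS (SELECT 1 FROM product_cost_view AS pc
                 WHERE pc.product_type_id = p.product_type_id
                   AND pc.product_id = p.product_id
                   AND pc.user_id = 1000000
                   AND pc.cost_type = 'X')) AS exists_count,
  (SELECT COUNT(DISTINCT p.product_id)
   FROM product AS p
   INNER JOIN product_cost_view AS pc
       ON p.product_id = pc.product_id
       AND p.user_id = pc.user_id
       AND p.product_type_id = pc.product_type_id
   WHERE p.user_id = 1000000
     AND pc.cost_type = 'X') AS join_count;
```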
