I have a query where I want to get all transactions for a specific user (a row in the owners table) in my database. The database is fairly normalized, so getting from a transaction to its owner traverses many tables. My tables, with the relevant foreign keys, are as follows:
**owners**
-------
id
**store_shops**
-----------
id
owner_id
**service_shops**
-------------
id
owner_id
**products**
-------------
id
store_shop_id
**services**
------------
id
service_shop_id
**order_services**
------------------
id
service_id
order_id
**order_products**
------------------
id
product_id
order_id
**orders**
----------
id
transaction_id
**transactions**
----------------
id
refund_transaction_id
amount
I have the following query:
SELECT DISTINCT ON (sales.id) sales.id, sales.amount FROM transactions sales
LEFT OUTER JOIN transactions refunds ON refunds.id = sales.refund_transaction_id
LEFT OUTER JOIN orders ON orders.transaction_id = sales.id OR orders.transaction_id = refunds.id
LEFT OUTER JOIN order_services ON order_services.order_id = orders.id
LEFT OUTER JOIN order_products ON order_products.order_id = orders.id
LEFT OUTER JOIN products ON products.id = order_products.product_id
LEFT OUTER JOIN services ON services.id = order_services.service_id
LEFT OUTER JOIN service_shops ON service_shops.id = services.service_shop_id
LEFT OUTER JOIN store_shops ON store_shops.id = products.store_shop_id
LEFT OUTER JOIN owners service_shop_owners ON service_shop_owners.id = service_shops.owner_id
LEFT OUTER JOIN owners store_shop_owners ON store_shop_owners.id = store_shops.owner_id
WHERE (service_shop_owners.id = 26930 OR store_shop_owners.id = 26930)
This gives me the desired results. Only trouble is that on a dataset of hundreds of thousands of records, it becomes unusably slow.
I'm not very advanced when it comes to SQL, but I realize that all these LEFT OUTER JOINs aren't very efficient.
Is there a better way for me to handle this query? Or am I going to have to denormalize the database a bit and store more info in the transaction table?
UPDATE: Using Wyzard's answer below, I now have this query:
SELECT trans.id, trans.amount, refunds.id
FROM
service_shops
JOIN services ON services.service_shop_id = service_shops.id
JOIN order_services ON order_services.service_id = services.id
JOIN orders ON orders.id = order_services.order_id
JOIN transactions trans ON trans.id = orders.transaction_id
LEFT JOIN transactions refunds ON refunds.id = trans.refund_transaction_id
WHERE service_shops.owner_id = 26930
UNION
SELECT trans.id, trans.amount, refunds.id
FROM
store_shops
JOIN products ON store_shops.id = products.store_shop_id
JOIN order_products ON order_products.product_id = products.id
JOIN orders ON orders.id = order_products.order_id
JOIN transactions trans ON trans.id = orders.transaction_id
LEFT JOIN transactions refunds ON refunds.id = trans.refund_transaction_id
WHERE store_shops.owner_id = 26930
This is very fast and a big improvement. The only problem now is that the two `LEFT JOIN transactions refunds ON refunds.id = trans.refund_transaction_id` joins do not seem to pick up the associated refund transactions. I'm assuming this is because the refunds do not have an order associated directly with them, so the WHERE clause filters them out.
`WHERE (service_shop_owners.id = 26930 OR store_shop_owners.id = 26930)` will effectively turn at least two of the LEFT JOINs into plain JOINs (which can be rewritten as EXISTS), and the rest can probably be dropped, since you only select from one table: `FROM transactions sales`.

`LEFT OUTER JOIN store_shops ON store_shops.id = products.id`: do these two tables really have the same IDs, or is that a mistake? (Comparing with the `service_shops` join, I'm guessing you might have meant something like `store_shops.id = products.store_shop_id`.)
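For what it's worth, the EXISTS rewrite suggested in the first comment could look roughly like the sketch below. It is not taken from any answer; it is one possible reading of that comment. To keep it runnable, the SQL is wrapped in a small Python script against an in-memory SQLite database standing in for the real one. The column types, the sample rows, and the helper names `make_demo_db` and `transactions_for_owner` are all made up for illustration. The query itself sticks to standard SQL, so it should carry over to other databases, though that is untested here.

```python
import sqlite3

# Schema mirroring the tables in the question; column types are assumed.
SCHEMA = """
CREATE TABLE owners (id INTEGER PRIMARY KEY);
CREATE TABLE store_shops (id INTEGER PRIMARY KEY, owner_id INTEGER);
CREATE TABLE service_shops (id INTEGER PRIMARY KEY, owner_id INTEGER);
CREATE TABLE products (id INTEGER PRIMARY KEY, store_shop_id INTEGER);
CREATE TABLE services (id INTEGER PRIMARY KEY, service_shop_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, transaction_id INTEGER);
CREATE TABLE order_services (id INTEGER PRIMARY KEY, service_id INTEGER, order_id INTEGER);
CREATE TABLE order_products (id INTEGER PRIMARY KEY, product_id INTEGER, order_id INTEGER);
CREATE TABLE transactions (id INTEGER PRIMARY KEY, refund_transaction_id INTEGER, amount INTEGER);
"""

# Invented sample rows:
#   transaction 1: service sale for owner 26930, no refund
#   transaction 2: product sale for owner 26930, refunded by transaction 3
#   transaction 4: product sale for a different owner (id 1)
SAMPLE_DATA = """
INSERT INTO owners VALUES (26930), (1);
INSERT INTO service_shops VALUES (10, 26930);
INSERT INTO store_shops VALUES (20, 26930), (21, 1);
INSERT INTO services VALUES (100, 10);
INSERT INTO products VALUES (200, 20), (201, 21);
INSERT INTO transactions VALUES (1, NULL, 50), (2, 3, 80), (3, NULL, -80), (4, NULL, 30);
INSERT INTO orders VALUES (1000, 1), (1001, 2), (1002, 4);
INSERT INTO order_services VALUES (1, 100, 1000);
INSERT INTO order_products VALUES (1, 200, 1001), (2, 201, 1002);
"""

# Each EXISTS checks one path to the owner (services or products).
# The IN (...) test matches an order attached to the sale or to its refund;
# a NULL refund_transaction_id simply never matches.
EXISTS_QUERY = """
SELECT sales.id, sales.amount
FROM transactions sales
WHERE EXISTS (
        SELECT 1
        FROM orders
        JOIN order_services ON order_services.order_id = orders.id
        JOIN services ON services.id = order_services.service_id
        JOIN service_shops ON service_shops.id = services.service_shop_id
        WHERE orders.transaction_id IN (sales.id, sales.refund_transaction_id)
          AND service_shops.owner_id = :owner_id
      )
   OR EXISTS (
        SELECT 1
        FROM orders
        JOIN order_products ON order_products.order_id = orders.id
        JOIN products ON products.id = order_products.product_id
        JOIN store_shops ON store_shops.id = products.store_shop_id
        WHERE orders.transaction_id IN (sales.id, sales.refund_transaction_id)
          AND store_shops.owner_id = :owner_id
      )
ORDER BY sales.id
"""


def make_demo_db():
    """Build the in-memory demo database with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript(SAMPLE_DATA)
    return conn


def transactions_for_owner(conn, owner_id):
    """Return (id, amount) for every transaction reachable from the owner."""
    return conn.execute(EXISTS_QUERY, {"owner_id": owner_id}).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(transactions_for_owner(conn, 26930))  # prints [(1, 50), (2, 80)]
```

Because each transaction row is tested at most once, no DISTINCT is needed, and the owner tables drop out entirely since `owner_id` is already available on the shop tables. Whether this beats the UNION version on real data would have to be checked with the actual query plans.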