Suppose I have these two tables, simplified for the purpose of the question:

CREATE TABLE merchandises
(
  id BIGSERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  price INT NOT NULL
);

CREATE TABLE gifts
(
  id BIGSERIAL NOT NULL PRIMARY KEY,
  from_user VARCHAR(255) REFERENCES users(id),
  to_user VARCHAR(255) REFERENCES users(id),
  with_merchandise BIGINT REFERENCES merchandises(id)
);

The merchandises table lists the available merchandise. The gifts table records that a user has sent a piece of merchandise to another user as a gift (a proper index is in place to avoid duplication).

What I would like to query is a list of merchandise that a user can send to another user, provided that the merchandise has not already been gifted from that user to that other user.

This is a query that works, but I hope to find one without a nested query, thinking that it might perform better thanks to the PostgreSQL optimizer.

SELECT DISTINCT ON (m.id) m.id, m.name, m.description
FROM merchandises m
WHERE m.id NOT IN (
    SELECT g.with_merchandise
    FROM gifts g
    WHERE g.from_user = 'some_user_id' AND g.to_user = 'some_other_user_id'
)
ORDER BY m.id ASC
LIMIT 20 OFFSET 0

In a previous attempt, I had this query, but I found out that it does not work:

SELECT DISTINCT ON (m.id) m.id, m.name, m.description
FROM merchandises m
LEFT JOIN gifts g
ON m.id = g.with_merchandise
WHERE g.id IS NULL 
OR g.from_user <> 'some_user_id' AND g.to_user <> 'some_other_user_id'
ORDER BY m.id ASC
LIMIT 20 OFFSET 0

This query does not work: even though the WHERE clause filters out gift rows between the two specific users, two other users might have exchanged a gift with the same merchandise (same with_merchandise), and those rows still let the merchandise through.

2 Answers

Even though you asked to remove the subquery, a NOT EXISTS subquery might run faster than NOT IN, especially if the NOT IN subquery returns a lot of values:

SELECT m.id, m.name, m.description
FROM merchandises m
WHERE NOT EXISTS (
    SELECT 1
    FROM gifts g
    WHERE g.with_merchandise = m.id
    AND g.from_user = 'some_user_id'
    AND g.to_user = 'some_other_user_id'
)

This query can take advantage of a composite index on gifts(with_merchandise, from_user, to_user).
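
As a minimal sketch, such an index could be created as shown here. The index name is hypothetical, and the column order simply follows the one suggested above. The question mentions that an index already exists to prevent duplicates; if that one covers the same three columns, a second index may be unnecessary.

```sql
-- Hypothetical index name; adjust to your own naming convention.
-- Columns are listed in the order suggested above.
CREATE INDEX IF NOT EXISTS gifts_merchandise_from_to_idx
    ON gifts (with_merchandise, from_user, to_user);
```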

If you would still rather use a LEFT JOIN, move the conditions on from_user and to_user from the WHERE clause to the ON clause:

SELECT m.id, m.name, m.description
FROM merchandises m
LEFT JOIN gifts g ON m.id = g.with_merchandise
  AND g.from_user = 'some_user_id' AND g.to_user = 'some_other_user_id' 
WHERE g.id IS NULL 
ORDER BY m.id ASC
LIMIT 20 OFFSET 0
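
One more difference between NOT IN and NOT EXISTS is worth checking against the schema in the question: with_merchandise is declared without NOT NULL, and if the NOT IN subquery ever returns a NULL, the NOT IN test can never be true, so the outer query returns no rows. The self-contained sketch below uses inline VALUES lists (made up for illustration) instead of the real tables:

```sql
-- Inline sample data: the subquery yields 1 and NULL.
-- 2 NOT IN (1, NULL) evaluates to NULL, so this returns zero rows.
SELECT 'kept' AS result
WHERE 2 NOT IN (SELECT x FROM (VALUES (1), (NULL)) AS t(x));

-- NOT EXISTS only asks whether a matching row exists,
-- so the NULL is harmless and this returns one row.
SELECT 'kept' AS result
WHERE NOT EXISTS (
    SELECT 1 FROM (VALUES (1), (NULL)) AS t(x) WHERE t.x = 2
);
```

If every gift row is guaranteed to have a merchandise, declaring the column NOT NULL would rule this case out.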

3 Comments

I don't necessarily have to use JOIN. However I don't know much about SQL, hence the concern. Of the two methods you proposed, which one would you recommend?
@Khanetor I prefer NOT EXISTS because it expresses the query's intention better than a LEFT JOIN, and I believe it should also run faster. The only way to know for sure, though, is to run both queries against your data.
Thanks @FuzzyTree, I'll check them out.

This uses a left outer join and should perform well.

SELECT m.*
FROM merchandises m
LEFT OUTER JOIN (
    SELECT with_merchandise
    FROM gifts
    WHERE from_user = 'some_user_id' AND to_user = 'some_other_user_id'
    GROUP BY with_merchandise
) g ON m.id = g.with_merchandise
WHERE g.with_merchandise IS NULL
ORDER BY m.id ASC
LIMIT 20 OFFSET 0
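
Whether it actually performs well on a given data set can be checked rather than assumed. A minimal sketch, assuming the tables from the question exist and hold representative data: prefix the query with EXPLAIN (ANALYZE, BUFFERS) to see the chosen plan and the measured timings. Note that ANALYZE really executes the query.

```sql
-- Shows the plan the optimizer picked, plus actual row counts and timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT m.*
FROM merchandises m
LEFT OUTER JOIN (
    SELECT with_merchandise
    FROM gifts
    WHERE from_user = 'some_user_id' AND to_user = 'some_other_user_id'
    GROUP BY with_merchandise
) g ON m.id = g.with_merchandise
WHERE g.with_merchandise IS NULL
ORDER BY m.id ASC
LIMIT 20 OFFSET 0;
```

Running the same prefix on the other variants gives a like-for-like comparison of their plans.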

2 Comments

This still uses a nested query, so I wonder how it is better (faster, more optimized) than my query above.
After thinking about your solution for a while, I think this works. However, you don't need the GROUP BY because for a given pair of users, each with_merchandise appears only once ;).
