17

As an example, I want to get the list of all items with certain tags applied to them. I could do either of the following:

SELECT Item.ID, Item.Name
FROM Item
WHERE Item.ID IN (
    SELECT ItemTag.ItemID
    FROM ItemTag
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)

Or

SELECT Item.ID, Item.Name
FROM Item
LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
GROUP BY Item.ID, Item.Name

Or something entirely different.

In general (assuming there is a general rule), what's a more efficient approach?

    @Larsenal: you can replace the LEFT JOIN with an INNER JOIN in your second query; the results will be the same. A LEFT JOIN returns NULLs in the ItemTag columns for rows in Item that have no matching ItemTag row, and your WHERE condition filters them out. Commented Jul 24, 2009 at 19:45
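
The point of this comment can be checked on a toy data set. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the sample rows (apple, pear, plum and their tags) are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ItemTag (ItemID INTEGER, TagID INTEGER);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 12);
""")

# Same query text, only the join type differs.
query = """
    SELECT Item.ID, Item.Name
    FROM Item
    {join} ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    GROUP BY Item.ID, Item.Name
    ORDER BY Item.ID
"""

left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

# Item 3 has no tags: the LEFT JOIN pads it with NULLs,
# and the WHERE clause then discards that row.
print(left_rows)
print(inner_rows)
```

Both statements return the same rows here because the WHERE clause tests a column of the outer-joined table, which is NULL for unmatched items.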

7 Answers

18
SELECT Item.ID, Item.Name
FROM Item
WHERE Item.ID IN (
    SELECT ItemTag.ItemID
    FROM ItemTag
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)

or

SELECT Item.ID, Item.Name
FROM Item
LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
GROUP BY Item.ID

Your second query won't compile, since it references Item.Name without either grouping or aggregating on it.

If we remove GROUP BY from the query:

SELECT  Item.ID, Item.Name
FROM    Item
JOIN    ItemTag
ON      ItemTag.ItemID = Item.ID
WHERE   ItemTag.TagID = 57 OR ItemTag.TagID = 55

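these are still different queries, unless ItemTag.ItemID is a UNIQUE key and marked as such: the JOIN returns one row per matching ItemTag row, so an item that carries both tags appears twice, while IN returns it once.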
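
A minimal sketch of that difference, assuming made-up sample rows and using Python's built-in sqlite3 module rather than SQL Server (the duplicate-row behavior is plain SQL semantics, not engine specific):

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags 55 and 57.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ItemTag (ItemID INTEGER, TagID INTEGER);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 55), (3, 12);
""")

in_rows = conn.execute("""
    SELECT Item.ID, Item.Name
    FROM Item
    WHERE Item.ID IN (
        SELECT ItemTag.ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
    ORDER BY Item.ID
""").fetchall()

join_rows = conn.execute("""
    SELECT Item.ID, Item.Name
    FROM Item
    JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    ORDER BY Item.ID
""").fetchall()

print(in_rows)    # each matching item once
print(join_rows)  # item 1 once per matching tag
```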

SQL Server is able to detect an IN condition on a UNIQUE column, and will just transform the IN condition into a JOIN.

If ItemTag.ItemID is not UNIQUE, the first query will use a kind of SEMI JOIN algorithm, which is quite efficient in SQL Server.

You can transform the second query into a JOIN that returns the same rows as the first:

SELECT  Item.ID, Item.Name
FROM    Item
JOIN    (
        SELECT DISTINCT ItemID
        FROM   ItemTag
        WHERE  ItemTag.TagID = 57 OR ItemTag.TagID = 55
        ) tags
ON      tags.ItemID = Item.ID

but this one is a trifle less efficient than IN or EXISTS.
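
As a rough check that the three formulations agree on results, here is a sketch using Python's built-in sqlite3 module with made-up sample rows. The EXISTS query is not written out in this answer, so its exact form below is an assumption. The sketch only compares result sets; it says nothing about relative speed in SQL Server.

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags 55 and 57.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ItemTag (ItemID INTEGER, TagID INTEGER);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 55), (3, 12);
""")

queries = {
    "IN": """
        SELECT Item.ID, Item.Name
        FROM Item
        WHERE Item.ID IN (
            SELECT ItemTag.ItemID
            FROM ItemTag
            WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
        ORDER BY Item.ID
    """,
    # Assumed shape of the EXISTS variant mentioned above.
    "EXISTS": """
        SELECT Item.ID, Item.Name
        FROM Item
        WHERE EXISTS (
            SELECT 1
            FROM ItemTag
            WHERE ItemTag.ItemID = Item.ID
              AND (ItemTag.TagID = 57 OR ItemTag.TagID = 55))
        ORDER BY Item.ID
    """,
    "JOIN on DISTINCT": """
        SELECT Item.ID, Item.Name
        FROM Item
        JOIN (
            SELECT DISTINCT ItemID
            FROM ItemTag
            WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
        ) tags
        ON tags.ItemID = Item.ID
        ORDER BY Item.ID
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```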

See this article in my blog for a more detailed performance comparison:


4

I think it depends on how the optimizer handles them; you may even end up with the same performance. Displaying the execution plan is your friend here.


2
SELECT Item.ID, Item.Name
...
GROUP BY Item.ID

This is not valid T-SQL. Item.Name must appear in the group by clause or within an aggregate function, such as SUM or MAX.


1

It's pretty much impossible (unless you're one of those crazy guru DBAs) to tell what will be fast and what won't without looking at the execution plan and/or running some stress tests.

2 Comments

In fact, it's easy to say: the second one is way faster. It will just refuse to compile in a nanosecond or so.
@Quassnoi Wouldn't that make it slower? It takes an infinite amount of time to return the result...
0

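Run this:

SET SHOWPLAN_ALL ON

Then run each version of the query. With this setting on, SQL Server returns the estimated plan instead of executing the statement.

You can see whether they produce the same plan and, if not, compare the TotalSubtreeCost on the first row of each to see how different they are.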
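
The same compare-the-plans workflow can be tried without a SQL Server instance. The sketch below is an analogy only: it uses SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module, which has no TotalSubtreeCost column, and SQLite's optimizer is not SQL Server's. The helper name plan is hypothetical.

```python
import sqlite3

# Empty tables are enough to ask SQLite for a plan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ItemTag (ItemID INTEGER, TagID INTEGER);
""")

in_query = """
    SELECT Item.ID, Item.Name
    FROM Item
    WHERE Item.ID IN (
        SELECT ItemTag.ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
"""

join_query = """
    SELECT Item.ID, Item.Name
    FROM Item
    JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    GROUP BY Item.ID, Item.Name
"""

def plan(sql):
    # Hypothetical helper: the last column of each EXPLAIN QUERY PLAN row
    # is the human-readable description of that plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

in_plan = plan(in_query)
join_plan = plan(join_query)

for label, steps in (("IN", in_plan), ("JOIN", join_plan)):
    print(label)
    for step in steps:
        print("   ", step)
print("same plan:", in_plan == join_plan)
```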


0

Performance always seems to get the vote, but you also hear "it is cheaper to buy hardware than programmers."

The second wins on performance.

Sometimes it is nice to look at SQL and know its purpose, but that's what comments are for. The first query uses the other table as a filter, which is pretty straightforward.

The second one would be easier to understand (leaving performance aside) with DISTINCT instead of GROUP BY. With a GROUP BY I would expect some aggregates in the SELECT list, but there aren't any. Speed kills.
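
A small sketch of the DISTINCT variant suggested above, next to the GROUP BY version from the question. It assumes made-up sample rows and uses Python's built-in sqlite3 module as a stand-in for SQL Server; it compares results only, not speed.

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags 55 and 57.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ItemTag (ItemID INTEGER, TagID INTEGER);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 55), (3, 12);
""")

group_by_rows = conn.execute("""
    SELECT Item.ID, Item.Name
    FROM Item
    JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    GROUP BY Item.ID, Item.Name
    ORDER BY Item.ID
""").fetchall()

# DISTINCT states the intent (remove duplicates) without implying aggregation.
distinct_rows = conn.execute("""
    SELECT DISTINCT Item.ID, Item.Name
    FROM Item
    JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    ORDER BY Item.ID
""").fetchall()

print(group_by_rows)
print(distinct_rows)
```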


0

The second one is more efficient in MySQL. MySQL will re-execute the subquery inside the IN clause for every row that the WHERE condition tests.
