select docid  from A  where  docid IN ( select distinct(docid) from B)

When I execute the above query in MySQL, it takes 33 seconds, which is too long for the size of the data.

Below are the details of both tables.

   Table A :
   | docid       | int(11)  | NO   | PRI | NULL    |       |
   Total number of entries = 500 (all entries are unique)

   Table B:
   | docid       | int(11)  | YES  |     | NULL    |       |
   Total number of entries = 66508
   (number of unique entries is 500)

   mysql version : 5.2

If I execute only "select docid from A", it takes 0.00 seconds, while "select docid from B" takes 0.07 seconds.

Then why does the IN query with a subquery take 33 seconds? Am I doing something wrong?

  • What do you want to achieve with that query? Commented Aug 12, 2011 at 11:32
  • I expect this query to execute within a second. Why is it taking so much time? Commented Aug 12, 2011 at 11:43
  • desc select docid from A where docid IN ( select distinct(docid) from B); -- the overhead comes from the number of rows that must be scanned in order to match the IN() Commented Aug 12, 2011 at 11:58

2 Answers


The IN list is very large - 60K entries. You would be better off using a join:

select distinct A.docid -- edited - I left out the A. :(
from A
join B on B.docid = A.docid;

This should execute very quickly. The distinct is needed because B holds many rows per docid; with it, the join gives the same result as your "IN" query.
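
The table description in the question shows no key on B.docid, so any lookup into B by docid has to scan the table. A minimal sketch of adding an index and checking the plan, assuming it is acceptable to alter B; the index name idx_b_docid is made up for illustration:

```sql
-- Hypothetical index name; the question's table description shows no key on B.docid.
CREATE INDEX idx_b_docid ON B (docid);

-- Inspect the plan for the join after the index exists.
EXPLAIN
SELECT DISTINCT A.docid
FROM A
JOIN B ON B.docid = A.docid;
```

Whether the optimizer actually uses the index is worth confirming in the EXPLAIN output rather than assuming.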


3 Comments

You need to add the alias in the select list, like select A.docid
Bohenian, I have edited my question because I used distinct in that query. When I execute "select distinct(docid) from B" it takes only 0.07 seconds, so why does it take 33 seconds with the IN query?
... because the separate queries can use the primary key index to find the matching rows but the subquery might get executed like a "for loop" where every matching row in the subquery from B is matched against A. The JOIN is more effectively optimised in MySQL at the moment, e.g. see technocation.org/content/oursql-episode-29%3A-subpar-subqueries and the MySQL manual dev.mysql.com/doc/refman/5.5/en/optimizing-subqueries.html

MySQL doesn't handle IN (subquery) well. It executes the inner query once for every row the outer query examines, rather than "remembering" the results.

Hence you are much better off doing a join.

Other RDBMSes don't do this btw.
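
To illustrate the point above: the MySQL manual's section on optimizing subqueries describes how older versions rewrite an IN subquery into a correlated EXISTS, which is then evaluated per outer row (MySQL 5.6 and later add semi-join and materialization strategies). A sketch of roughly what the question's query turns into, followed by a derived-table form that builds the distinct list once; the alias b_ids is made up for illustration:

```sql
-- Roughly how the IN subquery is evaluated: B is probed again for each row of A.
SELECT docid
FROM A
WHERE EXISTS (SELECT 1 FROM B WHERE B.docid = A.docid);

-- Alternative sketch: compute the distinct docids once, then join against them.
-- b_ids is a hypothetical alias, not a name from the question.
SELECT A.docid
FROM A
JOIN (SELECT DISTINCT docid FROM B) AS b_ids
  ON b_ids.docid = A.docid;
```

Running EXPLAIN on the original query should show the inner select marked as a DEPENDENT SUBQUERY if this rewrite is what the server is doing.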

1 Comment

Thanks Brian, for giving me this much clarity.
