0

I'm trying to build a query for the following scenario:

I have two tables Table1 and Table2.

The primary keys of Table1 goes like T1Attr1, T1Attr2 and so on.

Corresponding to each primary key in Table1, I can get a set of attributes from Table2 which goes like T2Attr1, T2Attr2 and so on.

I'm trying to query for the attributes which are common to the attributes of Table1, for example, if the input is T1Attr1 and T1Attr2, the results should have the attributes common to both of them from Table2. As the input parameters grows, the results would be less since common-to-all attributes would be less.

My query is similar to this:

Select indId, indName from indData where pId =1

intersect
Select indId, indName from indData where pId =2

intersect
Select indId, indName from indData where pId =3

The query works fine but when the pId list is huge(above 100), jdbc driver throws an error message.

Can someone please provide suggestions on using this query correctly or provide a better approach for the problem?

Thanks!

6
  • What is the primary key of this table? And what error message do you get exactly? Commented Jul 1, 2013 at 19:40
  • Primary key is an incremented value and not included in the query. The error I get is pasted here-> pastebin.com/VsG92F2n Commented Jul 1, 2013 at 20:07
  • No other unique constraints on the table? Commented Jul 1, 2013 at 20:37
  • 1
    Out of curiosity, did this error was produced with the intersection of 100 tables? Commented Jul 1, 2013 at 20:44
  • No other unique constraints, and yes I got that error when when the pId was from 1 to 100(so 100 intersections) Commented Jul 1, 2013 at 20:46

2 Answers 2

3

You can use this query but it won't be as efficient as the one you have:

SELECT indId, indName 
FROM indData 
WHERE pId IN (1, 2, 3, ..., 100)
GROUP BY indId, indName
HAVING COUNT(DISTINCT pId) = 100 ;  -- the number of pId you are searching on

You can also use JOINs. Perhaps this will result in a better execution plan and not cause this error. If there is a unique constraint on (indId, pId) this will be equivalent to your query:

SELECT a1.indId, a1.indName 
FROM indData AS a1
  JOIN indData AS a2
    ON a2.indId = a1.indId
  JOIN indData AS a3
    ON a3.indId = a1.indId 
  ...
  JOIN indData AS a100
    ON a100.indId = a1.indId
WHERE a1.pId = 1
  AND a2.pId = 2
  ...
  AND a100.pId = 100 ;

An index on (pId, indID) INCLUDE (indName) would help efficiency.

Sign up to request clarification or add additional context in comments.

Comments

2

Intersect is not the only way to do what you want. Your query is an example of a "set-within-sets" query. The "set" is the indid, indname pair. The "within-sets" are having all three values for pid.

I advocate using aggregation with a having clause for this type of query, because this is a very flexible approach for many types of conditions. In your case, the results query is:

select indid, indname
from indData
group by indid, indname
having SUM(case when pid = 1 then 1 else 0 end) > 0 and
       SUM(case when pid = 2 then 1 else 0 end) > 0 and
       SUM(case when pid = 3 then 1 else 0 end) > 0;

If you have an index on pid and the values are relatively rare, than adding a where pid in (1, 2, 3) could benefit the query performance-wise.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.