3

I have this table:

create table testtb (c1 number, c2 number);
insert into testtb values (1, 100);
insert into testtb values (2, 100);
insert into testtb values (3, 100);
insert into testtb values (3, 101);
insert into testtb values (4, 101);
insert into testtb values (5, 102);
commit; 

I'm struggling to come up with a SQL query that returns the following result when the WHERE clause is "c2=100":

result set:

c1 c2
-- ---
1  100
2  100
3  100
3  101
4  101

The result set contains "3,101" because it is reachable through "3,100" (both rows share c1 = 3). Likewise, "4,101" is reachable through "3,101" (both rows share c2 = 101), which in turn is reachable through "3,100".
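The reachability rule described above can be sketched outside the database as a small fixpoint loop. This is only an illustration in Python, not Oracle SQL; the function name `reachable_rows` is made up for this sketch, and it assumes two rows are connected whenever they share a c1 or a c2 value.

```python
def reachable_rows(rows, start_c2):
    """Return every (c1, c2) row connected to the rows where c2 == start_c2.

    Assumption from the question: two rows are connected when they share
    a c1 value or a c2 value.
    """
    # Seed with the rows matching the search condition.
    found = {row for row in rows if row[1] == start_c2}
    changed = True
    while changed:
        changed = False
        seen_c1 = {c1 for c1, _ in found}
        seen_c2 = {c2 for _, c2 in found}
        for row in rows:
            # Pull in any row that shares an identifier with a found row.
            if row not in found and (row[0] in seen_c1 or row[1] in seen_c2):
                found.add(row)
                changed = True
    return sorted(found)


# Sample data from the question.
rows = [(1, 100), (2, 100), (3, 100), (3, 101), (4, 101), (5, 102)]
print(reachable_rows(rows, 100))
```

The loop repeats until a pass adds nothing new, so it does not need to know the depth of the chain in advance.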

UPDATE: This table contains identifiers from two different data sets after a similarity join. The idea is to let the user search by any identifier and see all possible matches between the two data sets. That is why, when the user searches for "c2=100", I also want to return "3,101" and "4,101", so that the full graph of matches is shown.

Thanks.

4 Comments
  • Can you explain a little more what the rows represent? Why exactly does the 3,100 row add the other two items to the results? Commented Feb 23, 2011 at 22:01
  • Maybe give us what you have now so we can see what you are trying to do. Commented Feb 23, 2011 at 22:03
  • Are you looking for something like CONNECT BY? Is there a limit to the number of levels? Presumably at least 2 levels are needed to get the 4,101 entry. But it isn't very clear quite what you need... Commented Feb 23, 2011 at 22:07
  • @Alex Poole: My first attempt was to use "connect by", but I could not figure out how. It is not a "parent-child" relationship. The recursion level is not known, but I do not expect it to be greater than 5. Commented Feb 23, 2011 at 22:21

3 Answers

4
select distinct c1, c2
from testtb
connect by nocycle prior c1 = c1 or prior c2 = c2
start with c2 = 100
order by c1, c2;

3 Comments

This is it! One more question, if you don't mind: the execution plan for this query against the test table (with indexes on both columns) shows a full table scan. Is this because of "connect by"?
Yes, it probably has something to do with the connect by; I've had lots of performance problems with these types of queries. I added 100000 rows, built indexes on C1 and C2, and gathered stats, and 11gR2 only used the C2 index. Oddly, the full table scan cardinality estimate was 2(!) in the connect by query, but for a simple select count(*) from testtb the cardinality estimate was perfect. I was able to use both indexes by adding the hint /*+ dynamic_sampling(testtb, 4) */.
Thanks, I'll try running the query against real data with the hint and see how it goes.
3

Same idea as jonearles' answer, but using recursive subquery factoring:

  WITH pathtb(c1,c2) AS
  (
  SELECT c1,c2 FROM testtb WHERE c2=100
  UNION ALL
  SELECT testtb.c1,testtb.c2 FROM
     testtb JOIN pathtb ON (pathtb.c1=testtb.c1 or pathtb.c2=testtb.c2)
  ) CYCLE c1,c2 set cycle TO 1 default 0
  SELECT DISTINCT c1,c2 FROM pathtb WHERE cycle=0
  ORDER BY c1,c2
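For readers who want to try the recursive approach without an Oracle instance, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. It is not the Oracle query above: SQLite has no CYCLE clause, so this sketch relies on `UNION` (rather than `UNION ALL`) to discard rows that were already produced, which is what makes the recursion stop. The table and data are the ones from the question.

```python
import sqlite3

# In-memory SQLite database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtb (c1 INTEGER, c2 INTEGER)")
conn.executemany(
    "INSERT INTO testtb VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 100), (3, 101), (4, 101), (5, 102)],
)

# UNION (not UNION ALL) drops rows already in pathtb, so the recursion
# terminates once no new rows can be reached.
query = """
WITH RECURSIVE pathtb(c1, c2) AS (
    SELECT c1, c2 FROM testtb WHERE c2 = ?
    UNION
    SELECT t.c1, t.c2
    FROM testtb AS t
    JOIN pathtb AS p ON p.c1 = t.c1 OR p.c2 = t.c2
)
SELECT c1, c2 FROM pathtb ORDER BY c1, c2
"""
result = conn.execute(query, (100,)).fetchall()
print(result)
```

Whether the same `UNION` trick is accepted by a given Oracle release is not something this sketch verifies; it only shows the idea on SQLite.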

2 Comments

+1 Nice query. It looks more complicated than connect by, but your query is more standard and runs several times faster.
Agreed, nice query, but Oracle (10/11)g does not support this syntax :(
1

Try a subquery. I'm inferring this from your initial post; hope it helps.

select * from testtb where c1 in (select c1 from testtb where c2=100)

(I'm an MSSQL person, so apologies if this doesn't map 100% to PL/SQL, but you get the idea.)

Edit: Sorry, I see you also want 4,101. Maybe two levels of subquery then?

    select *
    from testtb
    where c2 in
    (select c2 from testtb where c1 in (select c1 from testtb where c2=100))
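A quick way to see what the nested-subquery approach does and does not cover is to run it on SQLite through Python's `sqlite3` module. This is a sketch, not Oracle: the extra row (4, 103) is hypothetical and added only to illustrate a chain one step longer than the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtb (c1 INTEGER, c2 INTEGER)")
conn.executemany(
    "INSERT INTO testtb VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 100), (3, 101), (4, 101), (5, 102)],
)

two_level = """
SELECT c1, c2
FROM testtb
WHERE c2 IN (SELECT c2 FROM testtb
             WHERE c1 IN (SELECT c1 FROM testtb WHERE c2 = 100))
ORDER BY c1, c2
"""

# On the sample data, two levels are enough to reach (4, 101).
sample_result = conn.execute(two_level).fetchall()
print(sample_result)

# Hypothetical extra row: reachable via (4, 101), but one step deeper.
conn.execute("INSERT INTO testtb VALUES (4, 103)")
deeper_result = conn.execute(two_level).fetchall()
print((4, 103) in deeper_result)
```

Each extra step in the chain needs another level of nesting, so this form fits only when the maximum depth is known in advance.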
