I'm using Oracle 11g and working on a legacy system. I have two tables, A and B. A has ~40M rows and B ~4M rows. A has a foreign key on B (B is the parent of A).
A query does essentially this:
select count(A.id), sum(A.something1), avg(B.something2), ....
from A inner join B on A.id_b = B.id
where A.filter_column in (...)
group by A.filter_column
The condition on filter_column is such that 5-10% of the rows of table A are selected, and the join picks up 20-25% of the rows of B. The end result is about 100 rows containing data aggregated from the selected rows of A and B.
The query originally contained a hint forcing Oracle to join A to B with nested loops. Removing the hint changes the execution plan from nested loops to a hash join. I expected that to run faster; to my great surprise, it actually runs slower!
Trying to find out why, I collected execution statistics with one of Tom Kyte's scripts ("runstats") and found that the nested loops version of the query did 0 (!) physical reads, while the hash join version did a lot of them. I ran the test many times, alternating which version ran first, but the results were always about the same. Furthermore, the hash join plan produces less logical I/O, yet it is still slower.
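For reference, the hinted version presumably looked something like the sketch below. The exact hint text is my assumption (I don't have the original); LEADING and USE_NL are the standard Oracle hints for forcing this kind of plan:

```sql
-- Sketch of the hinted query (hint text is an assumption, not the original).
-- LEADING(a) makes A the driving table; USE_NL(b) forces a nested loops
-- join into B, probing B's primary key index once per selected row of A.
select /*+ leading(a) use_nl(b) */
       count(a.id), sum(a.something1), avg(b.something2)
from   A a
       inner join B b on a.id_b = b.id
where  a.filter_column in (...)
group  by a.filter_column;
```

Removing the `/*+ ... */` comment is what lets the optimizer switch to the hash join plan.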
Does anybody have an explanation for that?
Thank you!