My query is:
SELECT category, count(*)
FROM large_data_table t
WHERE
(t.val IN (SELECT val FROM SMALL_TABLE WHERE id = 1))
OR
(t.val IN (SELECT val FROM SMALL_TABLE WHERE id = 2))
GROUP BY GROUPING SETS ((category), ());
The resulting execution plan is like this:
SELECT STATEMENT
SORT (GROUP BY ROLLUP)
FILTER
large_data_table TABLE ACCESS (FULL)
SMALL_TABLE TABLE ACCESS (BY INDEX ROWID)
SMALL_TABLE INDEX (RANGE SCAN)
SMALL_TABLE TABLE ACCESS (BY INDEX ROWID)
SMALL_TABLE INDEX (RANGE SCAN)
Where FILTER is, essentially, NESTED LOOP JOIN which never finishes as NESTED LOOPS often do on large tables. (the two subqueries on the small table return, say, 100 rows each). I am looking for a way to tell Oracle to NOT use the stupid FILTER here. I've tried sticking USE_HASH hint everywhere including some places I'd rather not talk about now.... GROUPING SETS seem to play a role in this, but getting rid of them is very difficult here. The one reasonable thing that works is to replace OR with a UNION, but it doesn't cover all cases either.
WHERE t.val IN (SELECT val FROM SMALL_TABLE WHERE id IN (1,2))should return the same thing.large_table(val, category)that should result in an index scan and an index onsmall_table (id, val)will probably help as well