I have a table with a schema like this:

id name

1 jack
2 jack of eden
3 eden of uk
4 m of s
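
To make the sample data reproducible, here is a minimal setup sketch. The table name temp (the name the answer below uses) and the column types are assumptions, not something stated in the question.

```sql
-- Hypothetical setup: table name and column types are assumptions.
create table temp (
  id   number primary key,
  name varchar2(100)
);

insert into temp (id, name) values (1, 'jack');
insert into temp (id, name) values (2, 'jack of eden');
insert into temp (id, name) values (3, 'eden of uk');
insert into temp (id, name) values (4, 'm of s');
commit;
```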

I want to run a query that gives me the count of each word, like this:

count word
2 jack
2 eden
3 of

This means jack appears 2 times, eden 2 times, and of 3 times.

I hope the question is clear. I'm trying too, but I can't find the right query or approach.

Thanks.

  • It should automatically show only the words that occur more than once. Commented Mar 5, 2012 at 9:23
  • same question stackoverflow.com/questions/881913/… Commented Jan 20, 2019 at 12:40

1 Answer

Assuming your table is named temp (probably not - change it to the right name of your table)

I used a subquery for finding all the words in your table:

select distinct regexp_substr(t.name, '[^ ]+',1,level) word , t.name, t.id
     from temp t
     connect by level <= regexp_count(t.name, ' ') + 1

This query splits out all the words from all records. I aliased it words.
Then I joined it with your table (in the query it's called temp) and counted, for each word, the number of records it occurs in.

select words.word, count(regexp_count(tt.name, words.word))
from(
select distinct regexp_substr(t.name, '[^ ]+',1,level) word , t.name, t.id
 from temp t
 connect by level <= regexp_count(t.name, ' ') + 1) words, temp tt
 where words.id= tt.id
 group by words.word

You can also add:

having count(regexp_count(tt.name, words.word)) > 1
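
One likely reason the query above gets slow on larger tables is that its connect by clause has no condition tying a generated level back to its source row, so the hierarchy is built across all rows and distinct then has to collapse a very large intermediate result. A commonly used idiom, sketched here under the same assumptions (a table temp with columns id and name), restricts each hierarchy to a single row with prior t.id = t.id; the prior sys_guid() is not null condition is there only to keep Oracle from reporting a connect-by loop. This sketch also drops the join back to temp and uses count(*), which assumes id is unique and that the goal is the number of rows containing each word.

```sql
-- Sketch only: assumes temp(id, name) with a unique id.
select word,
       count(*) as cnt
from (
  -- one row per (id, word); distinct removes repeats of a word inside one name
  select distinct
         t.id,
         regexp_substr(t.name, '[^ ]+', 1, level) as word
  from   temp t
  connect by level <= regexp_count(t.name, ' ') + 1
         and prior t.id = t.id              -- stay within the same source row
         and prior sys_guid() is not null   -- avoids the connect-by loop error
)
group  by word
having count(*) > 1
order  by word;
```

For the four sample rows in the question this should return eden 2, jack 2 and of 3. Removing distinct would count every occurrence of a word instead of every row that contains it.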

Update: for better performance, we can replace the inner subquery with the results of a pipelined function.
First, create a schema-level object type and a table type of it:

create or replace type t is object(word varchar2(100), pk number);
/
create or replace type t_tab as table of t;
/

then create the function:

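create or replace function split_string(del in varchar2) return t_tab
  pipelined is

  word    varchar2(4000);
  str_t   varchar2(4000);
  v_del_i number;
  iid     number;

  cursor c is
    select * from temp; -- change to your table

begin

  for r in c loop
    str_t := r.name;
    iid   := r.id;

    -- peel one word off the front of str_t per iteration
    while str_t is not null loop

      v_del_i := instr(str_t, del, 1, 1);

      if v_del_i = 0 then
        -- no delimiter left: the remainder is the last word
        word  := str_t;
        str_t := '';
      else
        word  := substr(str_t, 1, v_del_i - 1);
        -- skip the whole delimiter, even if it is longer than one character
        str_t := substr(str_t, v_del_i + length(del));
      end if;

      -- skip empty words produced by leading or repeated delimiters
      if word is not null then
        pipe row(t(word, iid));
      end if;

    end loop;

  end loop;

  return;
end split_string;
/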

now the query should look like:

select words.word, count(regexp_count(tt.name, words.word))
from(
select word, pk as id from table(split_string(' '))) words, temp tt
 where words.id= tt.id
 group by words.word
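
Because split_string already returns one (word, pk) row per word, the join back to temp may not be needed. A minimal sketch, assuming pk holds the unique id of the source row as in the function above, and that the goal is the number of rows containing each word:

```sql
-- Sketch only: relies on the types t / t_tab and the function split_string defined above.
select word,
       count(distinct pk) as cnt   -- rows containing the word, not raw occurrences
from   table(split_string(' '))
group  by word
having count(distinct pk) > 1
order  by word;
```

Replacing count(distinct pk) with count(*) would count every occurrence, including repeats within a single name.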
5 Comments

OK with Oracle 11g, but older versions don't come with REGEXP_COUNT. Otherwise: CONNECT BY length(regexp_substr(name, '[^ ]+',1,level)) >= 1
This is how I named your table in my test case :)
this is how I aliased the subquery that gives all the words, I'll add some explanation in my answer
@A.B.Cade the query is not responding. My table has >800 rows.
@A.B.Cade your subquery is taking a very long time to respond. Is there any optimization we can do in it? :) Thanks.
