Identify duplicates within columns using SQL (Oracle)

Question

I have the following dataset:

FLD_NB|RGN_CD
1     |NC
2     |SC
1     |MA
3     |GA
3     |MA

I am trying to identify all FLD_NB values that appear in more than one RGN_CD. For example, in the scenario above, FLD_NB=1 appears in both RGN_CD='NC' and RGN_CD='MA'.

What is the best way to identify the rows that have multiple instances of FLD_NB across RGN_CD?

What about FLD_NB=3, which should also be one such candidate, right?
– Sujitmohanty30, Commented Sep 16, 2020 at 19:17

GMB · Accepted Answer · 2020-09-16 19:19:37Z

2

You can use group by and having:

select fld_nb
from mytable
group by fld_nb
having count(*) > 1

This gives you all fld_nbs that appear more than once. Or, if you want fld_nbs that have more than one distinct rgn_cd, you can change the having clause to:

having count(distinct rgn_cd) > 1
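
If the full rows are needed rather than just the fld_nb values, one possible approach is to feed the grouped query into an IN filter. This is only a sketch: mytable is the placeholder table name used above, and the order by is added purely for readability.

```sql
-- Sketch: return every row whose fld_nb occurs under more than one distinct rgn_cd.
-- mytable is a placeholder; substitute the real table name.
select t.*
from mytable t
where t.fld_nb in (
    -- fld_nb values that are spread across at least two different regions
    select fld_nb
    from mytable
    group by fld_nb
    having count(distinct rgn_cd) > 1
)
order by t.fld_nb, t.rgn_cd;
```

With the sample data from the question, this should return the rows for fld_nb 1 and 3 and leave out fld_nb 2.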

answered Sep 16, 2020 at 19:19

GMB


1 Comment

sourabh bhattacharya Over a year ago

I think this will work for what I am specifically looking for. I reduced the number of columns in the result set, and I think I can get the right result using this query. I am still checking, but so far it looks good. Thanks a ton.

Sayan Malakshinov · Accepted Answer · 2020-09-17 09:32:57Z

0

Probably this is what you need:

select *
from (
     select t.*
           ,count(*)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;

Full test case with results:

with t (FLD_NB,RGN_CD) as (
select 1, 'NC' from dual union all
select 2, 'SC' from dual union all
select 1, 'MA' from dual union all
select 3, 'GA' from dual union all
select 3, 'MA' from dual 
)
select *
from (
     select t.*
           ,count(*)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;

Results:

    FLD_NB RG        CNT
---------- -- ----------
         1 NC          2
         1 MA          2
         3 MA          2
         3 GA          2

If you need to count only distinct values:

select *
from (
     select t.*
           ,count(distinct RGN_CD)over(partition by FLD_NB) cnt
     from t
     )
where cnt>1;
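
To see how the distinct variant behaves, here is a small test case in the same style as the one above. The extra rows (FLD_NB=22 repeated within a single RGN_CD) are made-up illustration data, not part of the original dataset.

```sql
-- Hypothetical data: FLD_NB=22 repeats, but only within one region.
with t (FLD_NB, RGN_CD) as (
select 1,  'NC' from dual union all
select 1,  'MA' from dual union all
select 22, 'NA' from dual union all
select 22, 'NA' from dual
)
select *
from (
     select t.*
            -- number of different regions per FLD_NB, not number of rows
           ,count(distinct RGN_CD) over (partition by FLD_NB) cnt
     from t
     )
where cnt > 1;
```

Only the two FLD_NB=1 rows should come back: FLD_NB=22 has a distinct count of 1, so it is filtered out, whereas count(*) would have kept it.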

edited Sep 17, 2020 at 9:32

answered Sep 16, 2020 at 22:08

Sayan Malakshinov


3 Comments

sourabh bhattacharya Over a year ago

Yes, that is the result set I am expecting. However, this table, as usual, has plenty of other columns along with these two main columns on which I have to identify the duplicates. I checked the SQL by changing the column names to the two main columns, but the result set still returns rows that I did not want to see. For example, if FLD_NB=22 has 7 rows with the same RGN_CD=NA, those rows are still returned, which I don't want because FLD_NB=22 isn't repeated in any other RGN_CD.

Sayan Malakshinov Over a year ago

@sourabhbhattacharya ah, ok, so just change count(*)over to count(distinct RGN_CD) over

Sayan Malakshinov Over a year ago

Great, so please mark the answer as correct if it works for you.
