How to delete duplicate records based on two unique columns in RedShift [duplicate]

Question

I have the following table in Redshift.

guest_id	name	rownum
1	Safvan	1
1	Safvan	2
1	Thomas	3
2	Anandu	1
2	Manish	2

I need to delete all the records in each partition based on guest_id except the record having max(rownum).

The result should be:

guest_id	name	rownum
1	Thomas	3
2	Manish	2

Thanks in advance for any help.

Please show your current attempt and describe what is wrong with it.
– astentx, Commented Sep 16, 2021 at 7:09
That helped me; it is a good thread. I have posted my solution as an answer.
– safvan, Commented Sep 16, 2021 at 8:21

safvan (edited by Rahul Biswas) · Accepted Answer · 2021-09-16 08:50:58Z

1

My solution is:

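-- Copy the table, ranking the rows of each guest_id by rownum (highest first).
create table table_rownum as (
    select
        *,
        row_number() over (partition by guest_id order by rownum desc) as rownum_temp
    from
        table_orig
);

-- Keep only the top-ranked row of each guest_id.
delete from table_rownum where rownum_temp <> 1;

-- Drop the helper column so the copy matches the original layout.
alter table table_rownum drop column rownum_temp;

-- Empty the original table and reload it from the de-duplicated copy.
truncate table table_orig;

insert into table_orig (select * from table_rownum);

drop table table_rownum;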

Please suggest a better solution if there is one.
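One possible variation on these steps, offered as an untested sketch: Redshift's TRUNCATE commits the transaction it runs in, so the reload cannot be rolled back once the table has been emptied. Replacing it with DELETE lets the whole swap run inside one transaction. The temporary table name keep_rows is hypothetical, and the column list assumes table_orig has exactly the columns guest_id, name, and rownum.

```sql
BEGIN;

-- keep_rows (hypothetical name) holds the row with the highest rownum per guest_id.
CREATE TEMP TABLE keep_rows AS
SELECT guest_id, name, rownum
FROM (
    SELECT guest_id, name, rownum,
           ROW_NUMBER() OVER (PARTITION BY guest_id ORDER BY rownum DESC) AS rownum_temp
    FROM table_orig
) ranked
WHERE rownum_temp = 1;

-- Unlike TRUNCATE, this DELETE can still be rolled back before COMMIT.
DELETE FROM table_orig;

INSERT INTO table_orig (guest_id, name, rownum)
SELECT guest_id, name, rownum
FROM keep_rows;

COMMIT;

DROP TABLE keep_rows;
```

The trade-off is that on a large table DELETE is slower than TRUNCATE and leaves space to be reclaimed by a later VACUUM.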

edited Sep 16, 2021 at 8:50 by Rahul Biswas

answered Sep 16, 2021 at 8:19 by safvan

1 Comment

Rahul Biswas Over a year ago

You can remove the duplicate records using a CTE; please check my answer. There is no need to create an extra table or to run DROP or TRUNCATE. A single DELETE is enough.

Rahul Biswas · Accepted Answer · 2021-09-16 09:36:04Z

0

The subquery returns the maximum rownum for each guest_id. Join it to the main table on guest_id, match the rows whose rownum differs from that maximum, and DELETE them.

DELETE redshift
FROM redshift r
INNER JOIN (SELECT guest_id
                 , MAX(rownum) rownum
           FROM redshift
           GROUP BY guest_id) t
        ON r.guest_id = t.guest_id
       AND r.rownum != t.rownum

You can check it in this fiddle (it runs on SQL Server 2017): https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=47081e3517000949460932808ac9f09d
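The query above uses SQL Server's DELETE ... FROM ... JOIN form. A rough PostgreSQL-style equivalent uses DELETE ... USING, a clause that Redshift's DELETE also has. This is an untested sketch that assumes Redshift accepts a derived table in USING and that the table is named redshift, as in the query above; the alias max_rownum is introduced here only for illustration.

```sql
-- Sketch only: delete every row whose rownum is not the maximum for its guest_id.
DELETE FROM redshift
USING (
    SELECT guest_id,
           MAX(rownum) AS max_rownum  -- max_rownum is a hypothetical alias
    FROM redshift
    GROUP BY guest_id
) t
WHERE redshift.guest_id = t.guest_id
  AND redshift.rownum <> t.max_rownum;
```

On the sample data from the question this should leave only (1, Thomas, 3) and (2, Manish, 2).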

Deleting the duplicate records using a CTE:

WITH t_cte AS (
       SELECT *
            , ROW_NUMBER() OVER (PARTITION BY guest_id ORDER BY rownum DESC) row_num
       FROM redshift
)
DELETE redshift
FROM t_cte c
INNER JOIN redshift r
        ON c.guest_id = r.guest_id
       AND c.rownum = r.rownum
       AND c.row_num > 1

You can check it in this fiddle (it runs on SQL Server 2017): https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=90b7099ca779c0836b90278ae1b3635a
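Before running any of these deletes, it may help to preview the rows the ROW_NUMBER() ranking marks for removal. The following is a minimal sketch that assumes the table is named redshift with the columns guest_id, name, and rownum; the derived-table alias ranked is hypothetical.

```sql
-- Preview only: list the rows that are NOT the highest rownum of their guest_id.
SELECT guest_id, name, rownum
FROM (
    SELECT guest_id, name, rownum,
           ROW_NUMBER() OVER (PARTITION BY guest_id ORDER BY rownum DESC) AS row_num
    FROM redshift
) ranked
WHERE row_num > 1
ORDER BY guest_id, rownum;
```

For the sample data in the question this lists (1, Safvan, 1), (1, Safvan, 2), and (2, Anandu, 1), which are exactly the rows missing from the expected result.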

edited Sep 16, 2021 at 9:36

answered Sep 16, 2021 at 8:20

Rahul Biswas

7 Comments

safvan Over a year ago

When using the CTE, will the delete affect the original table redshift? Or did you forget to include the original table in the main query?

safvan Over a year ago

WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY guest_id ORDER BY rownum DESC) row_num FROM redshift) DELETE FROM redshift rs inner join cte on rs.rownum=cte.row_num WHERE cte.row_num<>1;

Is this the correct query?

Rahul Biswas Over a year ago

Please check the given URL, where you can see the data.

safvan Over a year ago

Redshift is essentially built on Postgres, and the query throws an error in Postgres.

Rahul Biswas Over a year ago

You can use the first query.

Anil Kumar · Accepted Answer · 2021-09-16 07:16:44Z

-3

delete from table_name where rownum not in (select max(rownum) from table_name group by column_name)

answered Sep 16, 2021 at 7:16

Anil Kumar

2 Comments

astentx Over a year ago

It will delete all the data in the table if subquery returns more than one row.

Community Over a year ago

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
