I am using the query below to delete duplicate records from BigQuery using standard SQL, but it throws an error:

with cte as (
select * ,row_number()over (partition by CallRailCallId order by CallRailCallId) as rn
from `encoremarketingtest.EncoreMarketingTest.CallRailCall2` )

delete
 from cte
where rn>1
Query Failed
Error: Syntax error: Expected "(" or keyword SELECT but got keyword DELETE at [5:5]

Could anyone help me with the correct approach in BigQuery?

2 Comments
  • try delete from cte where rn>1; Commented May 25, 2018 at 9:44
  • tried but same error Commented May 25, 2018 at 10:45

1 Answer

Option #1

CREATE OR REPLACE TABLE `project.dataset.your_table` AS
SELECT * EXCEPT(rn)
FROM (
  SELECT *, ROW_NUMBER() OVER(PARTITION BY CallRailCallId ORDER BY CallRailCallId) rn
  FROM `project.dataset.your_table`
) 
WHERE rn = 1 

Option #2

CREATE OR REPLACE TABLE `project.dataset.your_table` AS
SELECT row.*
FROM (
  SELECT ARRAY_AGG(t ORDER BY CallRailCallId LIMIT 1)[OFFSET(0)] row
  FROM `project.dataset.your_table` t
  GROUP BY CallRailCallId
)   

As you might have noticed, the options above use a DDL (CREATE TABLE) approach, which makes it possible to rely on just the one column known from your question: CallRailCallId.
Also note that ORDER BY CallRailCallId plays no real role there, because GROUP BY and PARTITION BY use exactly the same field. If you change the ORDER BY field, it controls exactly which row (out of several duplicates) "survives" (for example, ORDER BY ts DESC; see the option below for what ts might be).
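To illustrate the ORDER BY point, here is a minimal sketch of Option #1 that keeps the most recent row per CallRailCallId. It assumes the table has a TIMESTAMP column named ts, which is hypothetical and does not appear in the question; substitute whatever column should decide the surviving row.

```sql
-- Sketch: keep only the most recent row per CallRailCallId.
-- Assumption: a TIMESTAMP column named ts exists (hypothetical, not from the question).
CREATE OR REPLACE TABLE `project.dataset.your_table` AS
SELECT * EXCEPT(rn)
FROM (
  -- rn = 1 marks the row with the latest ts within each CallRailCallId group
  SELECT *, ROW_NUMBER() OVER(PARTITION BY CallRailCallId ORDER BY ts DESC) AS rn
  FROM `project.dataset.your_table`
)
WHERE rn = 1
```

If two rows in a group share the same ts, ROW_NUMBER still assigns distinct numbers, so exactly one row per CallRailCallId survives, but which of the tied rows it is remains arbitrary.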

Option #3

This option uses DML (DELETE FROM) but requires an extra column to serve as a tie-breaker.

For example, suppose you have a ts TIMESTAMP field and you want the most recent row (based on ts) to survive:

DELETE FROM `project.dataset.your_table`
WHERE STRUCT(CallRailCallId, ts) NOT IN (
  SELECT AS STRUCT CallRailCallId, MAX(ts) ts
  FROM `project.dataset.your_table`
  GROUP BY CallRailCallId
  )
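Note that this DELETE keeps every row whose ts equals the maximum for its CallRailCallId, so rows that are tied on ts (including exact duplicates) all survive. A quick way to check whether any duplicates remain after running one of the options is a sketch like the following, which uses only the CallRailCallId column from the question; the row_count alias is just an illustrative name.

```sql
-- Sketch: list every CallRailCallId that still appears more than once.
-- An empty result means no duplicates remain on this column.
SELECT CallRailCallId, COUNT(*) AS row_count
FROM `project.dataset.your_table`
GROUP BY CallRailCallId
HAVING COUNT(*) > 1
ORDER BY row_count DESC
```

If this still returns rows after the DELETE, the tie-breaker column is probably not unique within each group, and Option #1 or Option #2 may be a better fit.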

2 Comments

@mayank - did you have a chance to try it?
The third option did not work; the first one worked for me. I did not try the second option.
