
I am importing data from an XML file into a SQL Server database. When I import the same file a second time, every row is duplicated.

I tried using DISTINCT to remove the duplicated rows after the import, but each new import still inserts the duplicates again.

How can I skip duplicates while importing data into a SQL Server database using DISTINCT?

My table:

CREATE TABLE HallSeat
(
    HallGroupID int,
    ShowSeatID int,
    Color nvarchar(15),
    Price int,
    SeatRow int,    
    SeatNumber int, 
    IsReserved bit
)

SQL DISTINCT statement:

SELECT DISTINCT * 
INTO tempdb.dbo.tmpTable
FROM HallSeat

DELETE FROM HallSeat

INSERT INTO HallSeat 
    SELECT * 
    FROM tempdb.dbo.tmpTable

DROP TABLE tempdb.dbo.tmpTable

1 Answer

You can use the T-SQL MERGE statement to do this. It will match the row set being imported with your HallSeat table. If the row doesn't exist, it will insert a new row. If the row does exist and there are differences, it can update it.

(You might not want to do the delete action, but I have included it for completeness.)

See Books Online > MERGE (Transact-SQL) -- https://msdn.microsoft.com/en-GB/library/bb510625.aspx

To demonstrate this, first create two tables.

CREATE TABLE dbo.HallSeat
(
    HallGroupID int NOT NULL,
    ShowSeatID int NOT NULL,
    Color nvarchar(15) NOT NULL,
    Price int NOT NULL,
    SeatRow int NOT NULL,
    SeatNumber int NOT NULL,
    IsReserved bit NOT NULL,
    CONSTRAINT PK_HallSeat PRIMARY KEY CLUSTERED (HallGroupID, ShowSeatID)
);

CREATE TABLE dbo.ImportHallSeat
(
    HallGroupID int NOT NULL,
    ShowSeatID int NOT NULL,
    Color nvarchar(15) NOT NULL,
    Price int NOT NULL,
    SeatRow int NOT NULL,
    SeatNumber int NOT NULL,
    IsReserved bit NOT NULL,
    CONSTRAINT PK_ImportHallSeat PRIMARY KEY CLUSTERED (HallGroupID, ShowSeatID)
);

Then import the XML data file into the ImportHallSeat table:

-- Read the XML data file to be imported
DECLARE @xml xml;
SELECT @xml = x.a
    FROM OPENROWSET(BULK 'F:\Work\Data.xml', SINGLE_BLOB) AS x(a);

TRUNCATE TABLE dbo.ImportHallSeat;

INSERT INTO dbo.ImportHallSeat(HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved)
    SELECT T.C.value('HallGroupID[1]', 'int') AS 'HallGroupID',
            T.C.value('ShowSeatID[1]', 'int') AS 'ShowSeatID',
            T.C.value('Color[1]', 'nvarchar(15)') AS 'Color',
            T.C.value('Price[1]', 'money') AS 'Price',
            T.C.value('SeatRow[1]', 'int') AS 'SeatRow',
            T.C.value('SeatNumber[1]', 'int') AS 'SeatNumber',
            T.C.value('IsReserved[1]', 'bit') AS 'IsReserved'
        FROM @xml.nodes(N'/Filharmonija/Hall/HallGroup/HallSeat') as T(C);
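
The query above implies a particular file layout. As a rough sketch, assuming each HallSeat element carries one child element per column (the element names are inferred from the XPath and value() calls, and the values are made up), the same shredding can be tried on an inline document without touching the file system:

```sql
-- Hypothetical sample document: element names are inferred from the XPath
-- and value() calls above, and the values are invented for illustration.
DECLARE @sample xml = N'
<Filharmonija>
  <Hall>
    <HallGroup>
      <HallSeat>
        <HallGroupID>1</HallGroupID>
        <ShowSeatID>101</ShowSeatID>
        <Color>Red</Color>
        <Price>20</Price>
        <SeatRow>1</SeatRow>
        <SeatNumber>1</SeatNumber>
        <IsReserved>0</IsReserved>
      </HallSeat>
      <HallSeat>
        <HallGroupID>1</HallGroupID>
        <ShowSeatID>102</ShowSeatID>
        <Color>Blue</Color>
        <Price>25</Price>
        <SeatRow>1</SeatRow>
        <SeatNumber>2</SeatNumber>
        <IsReserved>1</IsReserved>
      </HallSeat>
    </HallGroup>
  </Hall>
</Filharmonija>';

-- Shred the sample with the same path and value() expressions as the import.
SELECT T.C.value('HallGroupID[1]', 'int') AS HallGroupID,
       T.C.value('ShowSeatID[1]', 'int') AS ShowSeatID,
       T.C.value('Color[1]', 'nvarchar(15)') AS Color,
       T.C.value('Price[1]', 'money') AS Price,
       T.C.value('SeatRow[1]', 'int') AS SeatRow,
       T.C.value('SeatNumber[1]', 'int') AS SeatNumber,
       T.C.value('IsReserved[1]', 'bit') AS IsReserved
    FROM @sample.nodes(N'/Filharmonija/Hall/HallGroup/HallSeat') AS T(C);
```

If the real file uses attributes instead of child elements, the value() paths would need to change accordingly (for example '@Color' rather than 'Color[1]').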

Then we can update the HallSeat table with the data being imported:

MERGE
    INTO dbo.HallSeat AS H
    USING dbo.ImportHallSeat AS I
    ON I.HallGroupID = H.HallGroupID AND I.ShowSeatID = H.ShowSeatID
    WHEN MATCHED AND (H.Color <> I.Color OR H.Price <> I.Price)
        THEN UPDATE SET H.Color = I.Color, H.Price = I.Price
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved)
            VALUES (I.HallGroupID, I.ShowSeatID, I.Color, I.Price, I.SeatRow, I.SeatNumber, I.IsReserved)
    WHEN NOT MATCHED BY SOURCE
        THEN DELETE;
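
If the goal is only to skip rows that are already present, with no updates and no deletes, a plain INSERT with NOT EXISTS may be enough. This is a sketch of that alternative rather than part of the MERGE approach; it assumes the same dbo.HallSeat and dbo.ImportHallSeat tables and treats (HallGroupID, ShowSeatID) as the key that identifies a duplicate:

```sql
-- Insert only the imported rows whose key is not yet in dbo.HallSeat.
-- Assumption: (HallGroupID, ShowSeatID) uniquely identifies a seat.
INSERT INTO dbo.HallSeat (HallGroupID, ShowSeatID, Color, Price, SeatRow, SeatNumber, IsReserved)
    SELECT I.HallGroupID, I.ShowSeatID, I.Color, I.Price, I.SeatRow, I.SeatNumber, I.IsReserved
        FROM dbo.ImportHallSeat AS I
        WHERE NOT EXISTS
        (
            SELECT 1
                FROM dbo.HallSeat AS H
                WHERE H.HallGroupID = I.HallGroupID
                  AND H.ShowSeatID = I.ShowSeatID
        );
```

Running this a second time against the same import data should insert no rows, because every key already exists in the target table.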

Display the data that has been imported into the HallSeat table:

SELECT *
    FROM dbo.HallSeat;
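
To check that importing the same file again has not produced duplicates, a grouping query along these lines could be used. With the primary key defined above it should return no rows; it is more useful against a table without a key, such as the original HallSeat table in the question. The choice of (HallGroupID, ShowSeatID) as the duplicate key is the same assumption the answer makes:

```sql
-- List any key that occurs more than once, with its number of copies.
SELECT HallGroupID, ShowSeatID, COUNT(*) AS Copies
    FROM dbo.HallSeat
    GROUP BY HallGroupID, ShowSeatID
    HAVING COUNT(*) > 1;
```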

1 Comment

Big thanks, richard345! That was a perfect answer.