How to write a T-SQL Script for different Source DB/Table versions

Question

My main question: How do I write an INSERT query that can cope with a source table that lacks some of the fields it might have? (See the "A little background" section for the reasons.)

My destination table has a known set of fields, but my source table(s) might lack some of them. Like this:

Destination.TableA
----------
ID
Field 1
Field 2
Field 3

source1.TableA
-----
ID
Field 2

source2.TableA
-------------
ID
Field 1
Field 2

So what I basically need is an equivalent of "if field ... exists, then insert into ..." (primary keys should not be affected).

I know how to test for the existence of a table or other SQL object, but I don't know how to do that in an INSERT statement with regard to the table's fields. Can you give me a hint?

A little background: We have a lot of customers running different versions of our product. Over time, our database has been expanded, field by field. Some customer DBs have been customized to the needs of those customers.

Right now, I'm in the concept stage of developing a tool/script to upgrade all those old versions to our current database version. Because our latest version was refactored, this upgrade involves a lot of patching and moving of data, so I'd like to build a single script that can upgrade all versions of the source databases. This is mostly to avoid making the same adjustments repeatedly while developing the script.

As I already have tools in place to generate SQL queries, the amount of generated query code matters less to me than its functionality.

P.S. After this upgrade, we already have built-in upgrade mechanics in place to ease upgrading in the future. This script is meant to bring all our pre-2018 versions up to date and into our new product/upgrade cycle.

P.P.S. All databases run on Microsoft SQL Server.

If you really want to do this using SQL scripts then you could use dynamic SQL?
– Richard Hansell, Commented Mar 30, 2020 at 8:28
Maybe have one script to do data insertions with the full set of columns, which gets run after another script that adds the new columns in advance, along the lines of if not exists (select 1 from sys.columns where object_id = object_id('dbo.TableA') and name = 'Field2') alter table dbo.TableA add Field2 someDataType someConstraints; for each of the required columns.
– AlwaysLearning, Commented Mar 30, 2020 at 8:37
Well, then, I think AlwaysLearning's suggestion is a pretty good solution. It might be simpler than attempting to build a different insert...select statement for each customer.
– Zohar Peled, Commented Mar 30, 2020 at 11:51
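
The approach suggested in the comments (add the missing columns first, then run one full-column insert) could be sketched roughly as follows. This is only an illustration: the schema, table, and column names and the VARCHAR(50) type are borrowed from the test setup in the answer below, and a real script would need one check per column that older versions may lack.

```sql
-- Sketch only: pad an older source table with the columns it lacks,
-- so that a single full-column INSERT ... SELECT works for every version.
-- Names and data types are assumptions taken from the test setup.
IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID('source1.tableA') AND name = 'Field1')
    ALTER TABLE source1.tableA ADD Field1 VARCHAR(50) NULL;

IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID('source1.tableA') AND name = 'Field3')
    ALTER TABLE source1.tableA ADD Field3 VARCHAR(50) NULL;
GO  -- start a new batch so the INSERT below is compiled after the columns exist

INSERT INTO destination.tableA (ID, Field1, Field2, Field3)
SELECT ID, Field1, Field2, Field3
FROM source1.tableA;
```

The batch separator matters here: a statement that references a column which does not yet exist when its batch is compiled typically fails with an "Invalid column name" error, even if an earlier statement in the same batch adds that column.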

Richard Hansell · Accepted Answer · 2020-03-30 09:20:27Z

Okay, this is going to be quite a messy answer, as there are a lot of things going on here to get this to work in a reasonably sensible way. I'm using dynamic SQL to generate a MERGE script for each source table and then run it.

First I need some test data, so I created a Temp database to play with:

USE Temp;
GO
CREATE SCHEMA destination;
GO
CREATE SCHEMA source1;
GO
CREATE SCHEMA source2;
GO
CREATE TABLE destination.tableA (
    ID INT,
    Field1 VARCHAR(50),
    Field2 VARCHAR(50),
    Field3 VARCHAR(50));
GO
CREATE TABLE source1.tableA (
    ID INT,
    Field2 VARCHAR(50));
GO
CREATE TABLE source2.tableA (
    ID INT,
    Field1 VARCHAR(50),
    Field2 VARCHAR(50));
GO

Then I added some test data, making this repeatable:

DELETE FROM destination.tableA;
DELETE FROM source1.tableA;
DELETE FROM source2.tableA;
INSERT INTO source1.tableA SELECT 1, 'dog';
INSERT INTO source1.tableA SELECT 2, 'cat';
INSERT INTO source2.tableA SELECT 1, 'dog', 'harold';
INSERT INTO source2.tableA SELECT 3, 'mouse', 'midge';
GO

The plan is to start with a totally empty destination table, and to have some cases where all the data comes from one table and other cases where there's a mixture, e.g. one table provides one piece of data, but there's another column for the same primary key in another table. This makes things much more complex, as now we need to INSERT or UPDATE depending on the data. This is where MERGE comes in.

I also use FOR XML PATH to get comma-separated lists. Here's the "nasty" dynamic SQL:

IF OBJECT_ID('tempdb..#schemas') IS NOT NULL
    DROP TABLE #schemas;
GO
SELECT SCHEMA_NAME INTO #schemas FROM INFORMATION_SCHEMA.SCHEMATA WHERE CATALOG_NAME = 'Temp' AND SCHEMA_NAME LIKE 'source%';
WHILE EXISTS (SELECT * FROM #schemas)
BEGIN
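    DECLARE @schema SYSNAME;
    -- ORDER BY makes the processing order deterministic (source1 before source2)
    SELECT TOP 1 @schema = SCHEMA_NAME FROM #schemas ORDER BY SCHEMA_NAME;
    SELECT @schema;
    -- NVARCHAR(MAX) avoids silent truncation when a table has many columns
    DECLARE @sql NVARCHAR(MAX);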
    SELECT @sql = N'MERGE destination.tableA AS [target] USING (SELECT * FROM ' + QUOTENAME(@schema) + '.tableA) 
        AS [source] (' 
        + STUFF((SELECT ',' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_CATALOG = 'Temp' AND TABLE_SCHEMA = @schema AND TABLE_NAME = 'TableA' ORDER BY ORDINAL_POSITION FOR XML PATH('')), 1, 1, '')
        + N') ON [target].ID = [source].ID
        WHEN MATCHED THEN UPDATE SET '
        + STUFF((SELECT ',' + COLUMN_NAME + ' = [source].' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_CATALOG = 'Temp' AND TABLE_SCHEMA = @schema AND TABLE_NAME = 'TableA' AND COLUMN_NAME != 'ID' ORDER BY ORDINAL_POSITION FOR XML PATH('')), 1, 1, '')
        + N' WHEN NOT MATCHED THEN INSERT ('
        + STUFF((SELECT ',' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_CATALOG = 'Temp' AND TABLE_SCHEMA = @schema AND TABLE_NAME = 'TableA' ORDER BY ORDINAL_POSITION FOR XML PATH('')), 1, 1, '')
        + N')
        VALUES ('
        + STUFF((SELECT ', [source].' + COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_CATALOG = 'Temp' AND TABLE_SCHEMA = @schema AND TABLE_NAME = 'TableA' ORDER BY ORDINAL_POSITION FOR XML PATH('')), 1, 1, '')
        + ');';
        EXEC sp_executesql @sql;
        SELECT @sql;
        DELETE FROM #schemas WHERE SCHEMA_NAME = @schema;
END;
GO
SELECT * FROM [destination].tableA;
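
The comma-separated column lists in the script above all come from the same STUFF ... FOR XML PATH idiom. Taken on its own, and run against the test tables created earlier, it looks like this. The STRING_AGG variant is an alternative that assumes SQL Server 2017 or later, which may not be available on every customer server.

```sql
-- Builds 'ID,Field1,Field2' for source2.tableA from the test setup
SELECT STUFF((
    SELECT ',' + COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'source2' AND TABLE_NAME = 'tableA'
    ORDER BY ORDINAL_POSITION
    FOR XML PATH('')              -- glues the rows together: ',ID,Field1,Field2'
), 1, 1, '') AS column_list;      -- STUFF strips the leading comma

-- Alternative, assuming SQL Server 2017 or later
SELECT STRING_AGG(CAST(COLUMN_NAME AS NVARCHAR(MAX)), ',')
           WITHIN GROUP (ORDER BY ORDINAL_POSITION) AS column_list
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'source2' AND TABLE_NAME = 'tableA';
```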

Here's an example of one of the dynamic SQL scripts that gets run:

MERGE destination.tableA AS [target] USING (SELECT * FROM [source1].tableA) 
AS [source] (ID,Field2) ON [target].ID = [source].ID
WHEN MATCHED THEN UPDATE SET Field2 = [source].Field2 WHEN NOT MATCHED 
THEN INSERT (ID,Field2)
VALUES ( [source].ID, [source].Field2);

The end result of this is:

ID  Field1  Field2  Field3
1   dog     harold  NULL
2   NULL    cat     NULL
3   mouse   midge   NULL

Which looks good to me?

However, I imagine you would need to make quite a few changes to get this to work with your environment. I'm hoping this gives you a few ideas on how this could be done?

Wow, you put it all in there, thank you =) I don't particularly need MERGE; I can tell from the table name whether it needs to update or insert. But your way of putting the dynamic SQL together is a lot more elegant than I would've expected.
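
If MERGE isn't needed, the same idea can be reduced to a plain dynamic INSERT ... SELECT that lists only the columns present in both the source and the destination table. The following is a minimal sketch under the assumptions of the test setup above; the variable names are made up for illustration, and @schema would normally be set by a loop like the one in the answer.

```sql
DECLARE @schema SYSNAME = N'source1';   -- hypothetical: the source schema to copy from
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Collect the columns that exist in BOTH the source and the destination table
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(s.COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS AS s
    JOIN INFORMATION_SCHEMA.COLUMNS AS d
      ON d.COLUMN_NAME = s.COLUMN_NAME
    WHERE s.TABLE_SCHEMA = @schema       AND s.TABLE_NAME = 'tableA'
      AND d.TABLE_SCHEMA = 'destination' AND d.TABLE_NAME = 'tableA'
    ORDER BY d.ORDINAL_POSITION
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');

-- The same column list is used for the INSERT target and the SELECT
SET @sql = N'INSERT INTO destination.tableA (' + @cols + N') SELECT ' + @cols
         + N' FROM ' + QUOTENAME(@schema) + N'.tableA;';

PRINT @sql;                 -- inspect the generated statement before running it
EXEC sp_executesql @sql;
```

Destination columns that the source lacks are simply left out of the column list, so they receive NULL or their default value. QUOTENAME is used on the column names so that names containing spaces, such as "Field 1", would still produce valid SQL.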
