
I have a table as a source in Azure Data Factory with 336 columns and just 1 row, like this:

1       2       3       4       5       6       7       8       9
value1  value2  value3  value4  value5  value6  value7  value8  value9

And I want to combine every 3 columns into the first 3:

1       2       3
value1  value2  value3
value4  value5  value6
value7  value8  value9

What is the alternative to using a Select on every 3 columns and then a Join, as that is a long process with this many columns?

  • Is order important for you? Do you have any primary key? Commented May 11, 2021 at 10:52
  • The values are generic double numbers... no primary key, I just want to combine every 3 columns into the first 3. Commented May 12, 2021 at 13:01
  • I think you have a good answer :) Commented May 12, 2021 at 13:02

1 Answer


If your data source is Azure SQL DB, you could use conventional SQL to transform the row with a combination of UNPIVOT, PIVOT and some of the ranking functions to help group the data. A simple example:

DROP TABLE IF EXISTS #tmp;

-- Sample single-row, 9-column table standing in for the wide source
CREATE TABLE #tmp (
    col1    VARCHAR(10),
    col2    VARCHAR(10),
    col3    VARCHAR(10),
    col4    VARCHAR(10),
    col5    VARCHAR(10),
    col6    VARCHAR(10),
    col7    VARCHAR(10),
    col8    VARCHAR(10),
    col9    VARCHAR(10)
);

INSERT INTO #tmp
VALUES ( 'value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'value7', 'value8', 'value9' );

-- Unpivot the single row into 9 rows, assign each value a target row (nt)
-- and a target column (groupNumber), then pivot back out to 3 columns
SELECT [1], [2], [0] AS [3]
FROM
    (
    SELECT
        NTILE(3) OVER ( ORDER BY ( SELECT NULL ) ) AS nt,                  -- target row: 1,1,1,2,2,2,3,3,3
        ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) % 3 AS groupNumber, -- target column: 1,2,0 repeating
        newCol
    FROM #tmp
    UNPIVOT ( newCol FOR sourceCol IN ( col1, col2, col3, col4, col5, col6, col7, col8, col9 ) ) uvpt
    ) x
PIVOT ( MAX(newCol) FOR groupNumber IN ( [1], [2], [0] ) ) pvt;

Tweak the NTILE value depending on the number of columns you have: it should be the total number of columns divided by 3. For example, if you have 300 columns the NTILE value should be 100; if you have 336 columns it should be 112. A bigger example with 336 columns is available here.
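If you would rather not hard-code that value, a small sketch like this could derive it from the system catalog. This assumes the wide data lives in a permanent table; dbo.pivotWorking (the table name used in the notebook example below) is illustrative, so adjust it to your own table:

-- A minimal sketch: derive the NTILE value from the live column count
-- instead of hard-coding it (dbo.pivotWorking is an assumed table name)
DECLARE @ntile INT;

SELECT @ntile = COUNT(*) / 3
FROM sys.columns
WHERE [object_id] = OBJECT_ID(N'dbo.pivotWorking');

SELECT @ntile AS ntileValue;    -- 112 for a 336-column table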

Present the data to Azure Data Factory (ADF) either as a view, or use the Query option in the Copy activity, for example.
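For the view route, a minimal sketch might look like the following, assuming the sample data lives in a permanent table (dbo.wideSource and dbo.vCombined are illustrative names; a view cannot reference the #tmp temp table above):

-- A sketch of wrapping the transform in a view for ADF to read
CREATE VIEW dbo.vCombined
AS
SELECT [1], [2], [0] AS [3]
FROM
    (
    SELECT
        NTILE(3) OVER ( ORDER BY ( SELECT NULL ) ) AS nt,
        ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) % 3 AS groupNumber,
        newCol
    FROM dbo.wideSource
    UNPIVOT ( newCol FOR sourceCol IN ( col1, col2, col3, col4, col5, col6, col7, col8, col9 ) ) uvpt
    ) x
PIVOT ( MAX(newCol) FOR groupNumber IN ( [1], [2], [0] ) ) pvt;

ADF can then point the Copy activity's source at dbo.vCombined like any other table.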

My results: [screenshot of the query output]

If you are using Azure Synapse Analytics, another fun way to approach this would be Synapse Notebooks. With just three lines of code, you can read the table from the dedicated SQL pool, unpivot all 336 columns using the stack function, and write the result back to the database. This simple example is in Scala:

// expr comes from Spark's standard functions; the other two imports are
// for the Synapse dedicated SQL pool connector, per the connector docs
import org.apache.spark.sql.functions.expr
import org.apache.spark.sql.SqlAnalyticsConnector._
import com.microsoft.spark.sqlanalytics.utils.Constants

// Read the wide table from the dedicated SQL pool
val df  = spark.read.synapsesql("someDb.dbo.pivotWorking")

// stack(112, *) folds the 336 columns into 112 rows of 3 columns
val df2 = df.select( expr("stack(112, *)") )

// Write it back
df2.write.synapsesql("someDb.dbo.pivotWorking_after", Constants.INTERNAL)

I have to admire the simplicity of it.
