How to transform/unpivot data into two columns?

Question

I have the input data COL1 as shown below that has fields with their values. Some pairs fields/values are separated by = and others appear in table format but separated by text. I'm trying to get 2 columns as output, one with the fields and the other the corresponding values. For the rows containing =, the fields/values are in the same row but the values in "table format" only have their fields (headers=COUNTRY,CAPITAL,POPULATION,LANGUAGE) in first row before the values.

This is the input column and current output I have so far:

This is the desired output I'm trying to get that is like what "unpivot" would do with other kind of input:

This is my current code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lVDJcoQgEP0Vy3MO6sRlKpVDCx0kakOxHCbWVP7/L4KoI5mcwoV6zVuatyw5U+QkIbnsPUNvlMb8/rLkNwQTJlVRVREz5cmZW3YeBlo6mHaklfYTOKloRROQ8CA2K/aFbPg2qH0/SRbJBoTHw6gs6ktdFMVuu7KjjiPNYMY0MmxHQ/CNIU1Zvj5kGQeSdhP2OAnp50TYG28tTnYPLNNA7h0b3j4MUrgEmhkougyeBPz6ce85aLRuQ9dL257xG1vu2rRUsBISk+yp3vLP2/+qfirbDSDDgCdbh+dRHTU2dd1259rZyo9CSVwRprt+wgjGHbi6dl3RnMoHP/z4/gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [COL1 = _t]),
    DeleteMultiSpaces = Table.TransformColumns(Source, {{"COL1", each Text.Combine(List.Select(Text.SplitAny(_, " "), each _ <> "")," "), type text}}),
    AddAtSignAsSeparator = Table.ReplaceValue(
                          Table.ReplaceValue(DeleteMultiSpaces," = ","@",Replacer.ReplaceText,{"COL1"})
                          ," ","@",Replacer.ReplaceText,{"COL1"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(AddAtSignAsSeparator, {{"COL1", Splitter.SplitTextByDelimiter("@", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "COL1")
in
    #"Split Column by Delimiter"

horseyride · Accepted Answer · 2023-01-04 16:21:33Z

Or

let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lVDJcoQgEP0Vy3MO6sRlKpVDCx0kakOxHCbWVP7/L4KoI5mcwoV6zVuatyw5U+QkIbnsPUNvlMb8/rLkNwQTJlVRVREz5cmZW3YeBlo6mHaklfYTOKloRROQ8CA2K/aFbPg2qH0/SRbJBoTHw6gs6ktdFMVuu7KjjiPNYMY0MmxHQ/CNIU1Zvj5kGQeSdhP2OAnp50TYG28tTnYPLNNA7h0b3j4MUrgEmhkougyeBPz6ce85aLRuQ9dL257xG1vu2rRUsBISk+yp3vLP2/+qfirbDSDDgCdbh+dRHTU2dd1259rZyo9CSVwRprt+wgjGHbi6dl3RnMoHP/z4/gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [COL1 = _t]),
DeleteMultiSpaces = Table.TransformColumns(Source, {{"COL1", each Text.Combine(List.Select(Text.SplitAny(_, " "), each _ <> "")," "), type text}}),
#"Added Custom" = Table.AddColumn(DeleteMultiSpaces, "Custom", each Text.Contains([COL1],"=")),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 0, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Custom.1", each Text.Split([COL1]," ")),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.2", each {"COUNTRY","CAPITAL","POPULATION","LANGUAGE"}),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "Custom.3", each List.Repeat({[Index]},List.Count([Custom.1]))),
#"Filtered Rows" = Table.SelectRows(#"Added Custom3", each not Text.Contains([COL1], "POPULATION") and [Custom]=false),
Part1=Table.FromColumns({List.Combine(#"Filtered Rows"[Custom.3]),List.Combine(#"Filtered Rows"[Custom.2]),List.Combine(#"Filtered Rows"[Custom.1])}),

#"Filtered Rows2" = Table.SelectRows(#"Added Index", each ([Custom] = true)),
#"Split Column by Delimiter" = Table.SplitColumn(#"Filtered Rows2", "COL1", Splitter.SplitTextByDelimiter(" = ", QuoteStyle.Csv), {"Column2", "Column3"}),
#"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"Index", "Column1"}}),
Part2 = Table.RemoveColumns(#"Renamed Columns",{"Custom"}),

combined = Table.Combine({Part1, Part2}),
#"Added Index2" = Table.AddIndexColumn(combined, "Index", 0, 1, Int64.Type),
#"Sorted Rows" = Table.Sort(#"Added Index2",{{"Column1", Order.Ascending}, {"Index", Order.Ascending}}),
#"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Column1", "Index"})
in #"Removed Columns"

Thanks for your help. I selected your solution since it works independently the text for each field and the number of fields

davidebacci · Accepted Answer · 2023-01-04 13:58:20Z

1

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lVDJcoQgEP0Vy3MO6sRlKpVDCx0kakOxHCbWVP7/L4KoI5mcwoV6zVuatyw5U+QkIbnsPUNvlMb8/rLkNwQTJlVRVREz5cmZW3YeBlo6mHaklfYTOKloRROQ8CA2K/aFbPg2qH0/SRbJBoTHw6gs6ktdFMVuu7KjjiPNYMY0MmxHQ/CNIU1Zvj5kGQeSdhP2OAnp50TYG28tTnYPLNNA7h0b3j4MUrgEmhkougyeBPz6ce85aLRuQ9dL257xG1vu2rRUsBISk+yp3vLP2/+qfirbDSDDgCdbh+dRHTU2dd1259rZyo9CSVwRprt+wgjGHbi6dl3RnMoHP/z4/gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [COL1 = _t]),
    DeleteMultiSpaces = Table.TransformColumns(Source, {{"COL1", each Text.Combine(List.Select(Text.SplitAny(_, " "), each _ <> "")," "), type text}}),
    #"Added Custom" = Table.AddColumn(DeleteMultiSpaces, "Custom", each if Text.StartsWith([COL1],"CONTINENT") then [COL1] else null),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if Text.StartsWith([COL1],"YEAR") then [COL1] else null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.2", each if Text.StartsWith([COL1], "YEAR") or  Text.StartsWith([COL1], "CONTINENT") then null else [COL1]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"COL1"}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Columns", {{"Custom", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Custom"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Custom", type text}}),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Custom.1", Splitter.SplitTextByDelimiter("=", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Custom.1"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Custom.1", type text}}),
    #"Filled Down" = Table.FillDown(#"Changed Type1",{"Custom", "Custom.1"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each [Custom.2] <> null),
    #"Split Column by Delimiter2" = Table.SplitColumn(#"Filtered Rows", "Custom.2", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Custom.2.1", "Custom.2.2", "Custom.2.3", "Custom.2.4"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Custom.2.1", type text}, {"Custom.2.2", type text}, {"Custom.2.3", type text}, {"Custom.2.4", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type2", [PromoteAllScalars=true]),
    #"Changed Type3" = Table.TransformColumnTypes(#"Promoted Headers",{{" EUROPE", type text}, {" 2022", Int64.Type}, {"COUNTRY", type text}, {"CAPITAL", type text}, {"POPULATION", type text}, {"LANGUAGE", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type3",{{" EUROPE", "CONTINENT"}, {" 2022", "YEAR"}}),
    #"Filtered Rows1" = Table.SelectRows(#"Renamed Columns", each [COUNTRY] <> "COUNTRY"),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Filtered Rows1", {}, "Attribute", "Value"),
    #"Removed Duplicates" = Table.Distinct(#"Unpivoted Columns"),
    #"Changed Type4" = Table.TransformColumnTypes(#"Removed Duplicates",{{"Value", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type4",{{"Value", Text.Trim, type text}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Trimmed Text",{{"Attribute", "Field"}})
in
    #"Renamed Columns1"

edited Jan 4, 2023 at 13:58

answered Jan 4, 2023 at 9:13

davidebacci

31.4k4 gold badges20 silver badges48 bronze badges

4 Comments

Rasec Malkic Over a year ago

Hi David, thanks for your help. Your solution is almost, almost correct. Not sure in your output, why from row 21 to 24, is adding the "headers" in both columns. How to fix that?

davidebacci Over a year ago

Sorry about that - I have updated now for you.

Rasec Malkic Over a year ago

Thanks for your time and solution. I selected horseyride's solution since it works independently the text in each field and the number of fields. But certainly the logic behaind your steps would help me as learning.

davidebacci Over a year ago

No problem. I'm fairly sure this solution can be genericised easily enough but the other solution is also good.

Collectives™ on Stack Overflow

How to transform/unpivot data into two columns?

2 Answers 2

1 Comment

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related