I am fairly new to Power Query and need some help with the points listed below the code.

I have a piece of working code that does the following:

  1. Load the source.
  2. Find the max "series" of each row. The format is a mix of letters and numbers, e.g. Y21Q3S1; the letters stay the same and the numbers increase (year, quarter, and series).
  3. Check whether a certain tag is assigned to a row: I search all the tag columns and write the tag found in the "Tags" column, or "None" if none was found.
  4. Through grouping, find the points per tag for each "max series".
  5. Finally, present the result as a table in Excel: the first column is the series, followed by one column per tag and a "None" column for rows where none of the tags were present. I also add a "Last update" date column.

The code:

    let
    Source = Csv.Document(Web.Contents("somefile.csv"),[Delimiter=",", Columns=32, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(#"Promoted Headers", {{"Custom field (Points)", type number}}, "en-GB"),
    #"Added Custom" = Table.AddColumn(#"Changed Type with Locale", "Max Series", each List.Max({[Series], [Series_1], [Series_2], [Series_3], [Series_4]})),
    Tags = Table.AddColumn(#"Added Custom", "Tags", each if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag1") then "tag1"
        else if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag2") then "tag2"
        else if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag3") then "tag3"
        else "zzzNone"),
    RemoveDummy = Table.SelectRows(Tags, each [ID] <> "ID-1234"),
    #"Grouped Rows" = Table.Group(RemoveDummy, {"Max Series", "Tags"}, {{"Points per Tags", each List.Sum([#"Custom field (Points)"]), type number}}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Tags", Order.Ascending}}),
    #"Pivoted Column" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[Tags]), "Tags", "Points per Tags"),
    #"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"zzzNone", "None"}, {"Max Series", "Series"}}),
    #"Added Custom1" = Table.AddColumn(#"Renamed Columns", "Last update", each DateTime.LocalNow()),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom1",{{"Last update", type datetime}})
in
    #"Changed Type1"

  1. The "Series" and "Tags" columns are multi-value fields containing all series and tags of a row, which Excel splits into multiple columns. The problem is that the number of series and tags varies. To cope with this I created a dummy row with a lot of series. However, as you can see from the code, the column names still change: "Tags_2" to "Tags_6" somehow disappeared, and I had to fix the resulting error by removing them from the code.
    • Is there a dynamic way to say "if any column "Tags_*" contains "tag1" then ...", so that I don't have to hard-code the column names?
    • The same goes for "Max Series", where I would like to take the max value across all "Series_*" columns dynamically.
  2. I would like to make the "Tags" step more dynamic, so that the tags to search for come from a table in the Excel sheet instead of being hard-coded as "tag1", "tag2", etc.
  3. My current code only assigns the points to the first tag found. However, I would like to split the points across several tags: if two tags are found, each gets half of the points, and with three tags each gets one third. I don't know how to do this. Could you help me here?
  4. As I am fairly new to Power Query, my code might be far from optimal. Any suggestions on how I can improve it would be highly appreciated :-)

2 Answers

Hi mitru and welcome to StackOverflow!

You can make the 'Tags' step dynamic with the 'Unpivot other columns' and 'Group by' operations. Select all columns that are not Tags* columns and choose 'Unpivot other columns'. Then apply 'Group by' with the operation set to 'All Rows'.

You will get a column populated with tables. The next step is to add a custom column with the following formula:

    =if List.Contains([Tags][Value],"tag1") then "tag1"
    else if List.Contains([Tags][Value],"tag2") then "tag2"
    else if List.Contains([Tags][Value],"tag3") then "tag3"
    else "zzzNone"

[Tags] is the column containing the tables, while [Value] is the column within each table that contains the tags you are looking for.
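
As an alternative that avoids the unpivot step, the tag columns can also be picked up by name at refresh time. The following is only a sketch under assumptions: the `Sample` table stands in for the promoted-headers table, the new column name "Tag" is made up, and the Excel table name `TagList` with a column `Tag` mentioned in the comment is hypothetical.

```powerquery
let
    // Hypothetical sample standing in for the real source table
    Sample = Table.FromRecords({
        [ID = "ID-1", Tags = "tag2", Tags_7 = "other", Tags_8 = ""],
        [ID = "ID-2", Tags = "other", Tags_7 = "", Tags_8 = ""]
    }),
    // Tags to search for, in priority order. They could instead be read from
    // a table in the workbook (table and column names are assumptions):
    //   Excel.CurrentWorkbook(){[Name = "TagList"]}[Content][Tag]
    WantedTags = {"tag1", "tag2", "tag3"},
    // Every column called "Tags" or "Tags_<n>", whichever suffixes exist
    TagColumns = List.Select(
        Table.ColumnNames(Sample),
        each _ = "Tags" or Text.StartsWith(_, "Tags_")),
    // First wanted tag present in the row, or "zzzNone" if there is none
    WithTag = Table.AddColumn(Sample, "Tag", (row) =>
        let
            RowTags = Record.FieldValues(Record.SelectFields(row, TagColumns))
        in
            List.First(
                List.Select(WantedTags, (t) => List.Contains(RowTags, t)),
                "zzzNone"),
        type text)
in
    WithTag
```

Because `TagColumns` is computed from the column names, the step should keep working when columns such as "Tags_2" to "Tags_6" appear or disappear.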

A file with a sample solution that I created is available at the link below. https://sendeyo.com/en/608c8dee7f

Regarding bullet 3, I am not sure how the scoring system should work. Can you provide sample data with the expected final output?

1 Comment

Hi Gonso, thank you very much for the reply. It was very helpful, and I was able to make the query much more dynamic. I suppose it is not possible to do the same for the "series" step, since I would need to unpivot and group in two separate steps without knowing exactly how many "series" and "tags" columns I will have?
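
On the "series" step, one possible approach that needs no second unpivot is to select the series columns by name and take the maximum per row. This is a sketch on a made-up `Sample` table, assuming every series column is called "Series" or starts with "Series_".

```powerquery
let
    // Hypothetical sample standing in for the real source table
    Sample = Table.FromRecords({
        [ID = "ID-1", Series = "Y21Q2S1", Series_1 = "Y21Q3S1", Series_2 = ""],
        [ID = "ID-2", Series = "Y21Q1S2", Series_1 = "", Series_2 = ""]
    }),
    // Every column called "Series" or "Series_<n>"
    SeriesColumns = List.Select(
        Table.ColumnNames(Sample),
        each _ = "Series" or Text.StartsWith(_, "Series_")),
    // Largest value across those columns for each row
    WithMax = Table.AddColumn(Sample, "Max Series", (row) =>
        List.Max(Record.FieldValues(Record.SelectFields(row, SeriesColumns))),
        type nullable text)
in
    WithMax
```

Note that the values are compared as text, so this relies on the year, quarter, and series numbers keeping a fixed width (a series "S10" would sort before "S9").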

With the help of Gonso's post I was able to make the "tags" step more dynamic.

Furthermore, I found a solution for bullet 3: assigning points to several tags when more than one tag is present.

I am posting the updated code here in case anyone else finds the solution helpful:

let
    Source = Csv.Document(Web.Contents("somefile"),[Delimiter=",", Columns=50, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    Promoted_Headers = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    Changed_Type_with_Locale = Table.TransformColumnTypes(Promoted_Headers, {{"Custom field (Points)", type number}}, "en-GB"),
    Max_Series = Table.AddColumn(Changed_Type_with_Locale, "Max Series", each List.Max({[Series], [Series_1], [Series_2], [Series_3], [Series_4]})),
    Unpivoted_Other_Columns = Table.UnpivotOtherColumns(Max_Series, {"Type", "ID", "Custom field (Points)", "Series", "Series_1", "Series_2", "Series_3", "Series_4", "Series_5", "Series_6", "Series_7", "Series_8", "Series_9", "Max Series"}, "Attribute", "Value"),
    Tags = Table.Group(Unpivoted_Other_Columns, {"Type", "ID", "Custom field (Points)", "Series", "Series_1", "Series_2", "Series_3", "Series_4", "Series_5", "Series_6", "Series_7", "Series_8", "Series_9", "Max Series"}, {{"Tags", each _, type table [Type=nullable text, ID=nullable text, #"Custom field (Points)"=nullable number, Series=nullable text, Series_1=nullable text, Series_2=nullable text, Series_3=nullable text, Series_4=nullable text, Series_5=nullable text, Series_6=nullable text, Series_7=nullable text, Series_8=nullable text, Series_9=nullable text, Max Series=text, Attribute=text, Value=text]}}),
    // Flag each tag with 1 if it occurs in the row's nested table, otherwise 0
    Tag1 = Table.AddColumn(Tags, "Tag1", each if List.Contains([Tags][Value],"tag1") then 1 else 0),
    Tag2 = Table.AddColumn(Tag1, "Tag2", each if List.Contains([Tags][Value],"tag2") then 1 else 0),
    Tag3 = Table.AddColumn(Tag2, "Tag3", each if List.Contains([Tags][Value],"tag3") then 1 else 0),
    // TagNone is 1 only when none of the tags matched
    NoneTag = Table.AddColumn(Tag3, "TagNone", each if List.Sum({[Tag1], [Tag2], [Tag3]}) > 0 then 0 else 1),
    // Number of flags set on the row; the points are split equally across them
    Tag_total = Table.AddColumn(NoneTag, "Tag_total", each List.Sum({[Tag1], [Tag2], [Tag3], [TagNone]})),
    Tag_update = Table.ReplaceValue(Tag_total,each [Tag1], each if [Tag1] > 0 then ([#"Custom field (Points)"] * ([Tag1] / [Tag_total])) else [Tag1],Replacer.ReplaceValue,{"Tag1"}),
    Tag2_update = Table.ReplaceValue(Tag_update, each [Tag2], each if [Tag2] > 0 then ([#"Custom field (Points)"] * ([Tag2] / [Tag_total])) else [Tag2],Replacer.ReplaceValue,{"Tag2"}),
    Tag3_update = Table.ReplaceValue(Tag2_update, each [Tag3], each if [Tag3] > 0 then ([#"Custom field (Points)"] * ([Tag3] / [Tag_total])) else [Tag3],Replacer.ReplaceValue,{"Tag3"}),
    TagNone_update = Table.ReplaceValue(Tag3_update, each [TagNone], each if [TagNone] > 0 then ([#"Custom field (Points)"] * ([TagNone] / [Tag_total])) else [TagNone],Replacer.ReplaceValue,{"TagNone"}),
    RemoveDummy = Table.SelectRows(TagNone_update, each [ID] <> "ID-1234"),
    Grouped_Series = Table.Group(RemoveDummy, {"Max Series"}, {{"Tag1", each List.Sum([Tag1]), type number}, {"Tag2", each List.Sum([Tag2]), type number}, {"Tag3", each List.Sum([Tag3]), type number}, {"None", each List.Sum([TagNone]), type nullable number}}),
    Sorted_Series = Table.Sort(Grouped_Series,{{"Max Series", Order.Ascending}}),
    Renamed_Series = Table.RenameColumns(Sorted_Series,{{"Max Series", "Series"}}),
    Added_last_update = Table.AddColumn(Renamed_Series, "Last update", each DateTime.LocalNow()),
    Changed_date_Type = Table.TransformColumnTypes(Added_last_update,{{"Last update", type datetime}})
in
    Changed_date_Type
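
The per-tag columns above could probably be generalised to an arbitrary tag list. The sketch below is one way to do that, not a drop-in replacement: the `Sample` table and the column names `Points`, `RowTags`, `Tag`, and `Share` are made up for illustration, and it assumes the tags of each row have already been collected into a list.

```powerquery
let
    // Hypothetical sample: one row per item, tags collected into a list
    Sample = Table.FromRecords({
        [#"Max Series" = "Y21Q3S1", Points = 6, RowTags = {"tag1", "tag2", "other"}],
        [#"Max Series" = "Y21Q3S1", Points = 3, RowTags = {"other"}]
    }),
    WantedTags = {"tag1", "tag2", "tag3"},
    // Wanted tags found on the row; {"None"} when nothing matched
    Matched = Table.AddColumn(Sample, "Tag", (row) =>
        let
            Found = List.Select(WantedTags, (t) => List.Contains(row[RowTags], t))
        in
            if List.IsEmpty(Found) then {"None"} else Found),
    // Each matched tag receives an equal share of the row's points
    Share = Table.AddColumn(Matched, "Share",
        each [Points] / List.Count([Tag]), type number),
    // One row per (item, tag) pair
    Expanded = Table.ExpandListColumn(Share, "Tag"),
    // Sum the shares per series and tag, then turn the tags into columns
    Grouped = Table.Group(Expanded, {"Max Series", "Tag"},
        {{"Points", each List.Sum([Share]), type number}}),
    Pivoted = Table.Pivot(Grouped, List.Distinct(Grouped[Tag]), "Tag", "Points")
in
    Pivoted
```

In the sample, the first row has two matching tags, so its 6 points are split into 3 for "tag1" and 3 for "tag2", while the 3 points of the second row go to "None". Adding a tag then only means extending `WantedTags`, with no new steps.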
