Let us say I have this data frame.
df
    line  to_line  priority
    10    20       1
    10    30       1
    50    40       3
    60    70       2
    50    80       3
Based on the line and priority column values (when rows have the same values in both columns, as shown above), I want to combine their to_line values. The desired result should look like the following.
    line  to_line  priority
    10    20/30    1
    50    40/80    3
    60    70       2
I tried something like this but I couldn't get what I want.
df.groupBy(col("line")).agg(collect_list(col("to_line")) as "to_line").withColumn("to_line", concat_ws(",", col("to_line")))
Could you please help me figure this out? I appreciate your time and effort.
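For reference, here is a minimal sketch of one way the desired output might be produced, assuming Spark with the Scala API and integer columns. The attempt above groups by line only and joins with ","; this sketch groups by both line and priority and joins with "/". The names `CombineToLine` and `combine` are hypothetical, and `sort_array` is an added assumption: `collect_list` does not guarantee element order after a shuffle, so the values are sorted before joining to keep the result deterministic.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, concat_ws, sort_array}

// Hypothetical helper object, for illustration only.
object CombineToLine {

  // Groups rows that share both `line` and `priority`, then joins their
  // `to_line` values with "/" (assumption: ascending order is acceptable).
  def combine(df: DataFrame): DataFrame =
    df.groupBy(col("line"), col("priority"))
      // Sort the collected values so the joined string is deterministic.
      .agg(sort_array(collect_list(col("to_line"))).as("to_line"))
      // Cast explicitly to array<string> before joining the elements.
      .withColumn("to_line", concat_ws("/", col("to_line").cast("array<string>")))
      .select("line", "to_line", "priority")

  def main(args: Array[String]): Unit = {
    // Local session for the example; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("combine-to-line")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question.
    val df = Seq(
      (10, 20, 1),
      (10, 30, 1),
      (50, 40, 3),
      (60, 70, 2),
      (50, 80, 3)
    ).toDF("line", "to_line", "priority")

    combine(df).show()
    spark.stop()
  }
}
```

With the sample data this should yield one row per (line, priority) pair; the order of the rows themselves is not guaranteed unless an explicit `orderBy` is added.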