Here's a link to an example of what I want to achieve: https://community.powerbi.com/t5/Desktop/Append-Rows-using-Another-columns/m-p/401836. Basically, I need to merge all the rows of a pair of columns into another pair of columns. How can I do this in Spark Scala?

Input

[image: input table]

Output

[image: output table]

1 Answer

Correct me if I'm wrong, but I understand that you have a DataFrame with two pairs of columns (date and cost), and you want the rows of the second pair appended under the first pair, right?

For instance, with this input (only two rows for simplicity):

df.show
+----+----------+-----------+----------+---------+
|name|     date1|      cost1|     date2|    cost2|
+----+----------+-----------+----------+---------+
|   A|2013-03-25|19923245.06|          |         |
|   B|2015-06-04| 4104660.00|2017-10-16|392073.48|
+----+----------+-----------+----------+---------+

With just a couple of selects and a union, you can achieve what you want:

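// Stack the second (date, cost) pair under the first one.
// union matches columns by position, so both selects must list them in the same order.
df.select("name", "date1", "cost1")
  .union(df.select("name", "date2", "cost2"))
  .withColumnRenamed("date1", "date") // the result keeps the first select's column names
  .withColumnRenamed("cost1", "cost")
  .show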

+----+----------+-----------+
|name|      date|       cost|
+----+----------+-----------+
|   A|2013-03-25|19923245.06|
|   B|2015-06-04| 4104660.00|
|   A|          |           |
|   B|2017-10-16|  392073.48|
+----+----------+-----------+
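
Note that the union keeps the empty row for A, since A has no second date/cost pair. If you don't want such rows, you can filter them out after the union. Here is a minimal sketch, assuming the date and cost columns are strings and that missing values are either null or blank; the object name `AppendColumnPairs` and the helper `appendPairs` are just names I made up for the example:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, trim}

object AppendColumnPairs {
  // Hypothetical helper: stacks (date2, cost2) under (date1, cost1)
  // and drops rows whose date is null or blank (assumes string columns).
  def appendPairs(df: DataFrame): DataFrame = {
    val first  = df.select(col("name"), col("date1").as("date"), col("cost1").as("cost"))
    val second = df.select(col("name"), col("date2").as("date"), col("cost2").as("cost"))
    first
      .union(second) // union matches columns by position, not by name
      .filter(col("date").isNotNull && trim(col("date")) =!= "")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("append-column-pairs")
      .getOrCreate()
    import spark.implicits._

    // Same two rows as the example input above, with blanks as empty strings.
    val df = Seq(
      ("A", "2013-03-25", "19923245.06", "", ""),
      ("B", "2015-06-04", "4104660.00", "2017-10-16", "392073.48")
    ).toDF("name", "date1", "cost1", "date2", "cost2")

    appendPairs(df).show()
    spark.stop()
  }
}
```

With the two-row input above, this keeps the three non-empty rows and drops the blank one for A. If your real data has typed columns (for example `DateType`), the `isNotNull` check alone should be enough and the `trim` comparison can go.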