
I have the following input:

id size
1 4
2 2

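Expected output: each row should be repeated as many times as its size value, with the size column counting from 1 up to the original value. A row with size 4 becomes four rows (1 to 4), and a row with size 2 becomes two rows (1 to 2).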

id size
1 1
1 2
1 3
1 4
2 1
2 2

3 Answers


You can build an array holding the sequence from 1 to size with the sequence function (available since Spark 2.4) and then explode it:

import org.apache.spark.sql.functions._
import spark.implicits._ // needed for toDF; spark-shell imports this automatically
val df = Seq((1,4), (2,2)).toDF("id", "size")
df
  .withColumn("size", explode(sequence(lit(1), col("size"))))
  .show(false)

The output would be:

+---+----+
|id |size|
+---+----+
|1  |1   |
|1  |2   |
|1  |3   |
|1  |4   |
|2  |1   |
|2  |2   |
+---+----+
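To make the row-by-row logic of sequence followed by explode easier to see, here is a minimal sketch of the same transformation on plain Scala collections, with no Spark involved. The names ExpandBySize and expand are made up for illustration and are not part of any Spark API.

```scala
// Plain Scala collections only; ExpandBySize and expand are illustrative names.
object ExpandBySize {
  // For each (id, size) pair, emit (id, 1), (id, 2), ..., (id, size).
  // (1 to size) plays the role of sequence; flatMap plays the role of explode.
  def expand(rows: Seq[(Int, Int)]): Seq[(Int, Int)] =
    rows.flatMap { case (id, size) => (1 to size).map(n => (id, n)) }

  def main(args: Array[String]): Unit = {
    val input = Seq((1, 4), (2, 2))
    // Prints one "id size" pair per line: 1 1, 1 2, 1 3, 1 4, 2 1, 2 2
    expand(input).foreach { case (id, n) => println(s"$id $n") }
  }
}
```

One difference worth keeping in mind: `1 to size` is empty when size is below 1, whereas Spark's sequence counts downward when the start is greater than the stop, so the two may not agree for sizes of 0 or less.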

1 Comment

Ooooh nice, I didn't know about the sequence function! Better than my proposal, upvoted :)

You can first use the sequence function to create a sequence from 1 to size and then explode it.

val df = input.withColumn("seq", sequence(lit(1), $"size"))
df.show()
+---+----+------------+
| id|size|         seq|
+---+----+------------+
|  1|   4|[1, 2, 3, 4]|
|  2|   2|      [1, 2]|
+---+----+------------+

df.withColumn("size", explode($"seq")).drop("seq").show()
+---+----+
| id|size|
+---+----+
|  1|   1|
|  1|   2|
|  1|   3|
|  1|   4|
|  2|   1|
|  2|   2|
+---+----+

1 Comment

How is this solution different from the one I suggested? (You don't have to create an additional column and then drop it)

You could turn your size column into an incrementing sequence using Seq.range and then explode the arrays. Something like this:

import spark.implicits._
import org.apache.spark.sql.functions.{explode, col}

// Original dataframe
val df = Seq((1,4), (2,2)).toDF("id", "size")

// Mapping over this dataframe: turning each row into (idx, array)
val output = df
  .map(row => {
    (row.getInt(0), Seq.range(1, row.getInt(1) + 1)) 
  })
  .toDF("id", "array")
  .select(col("id"), explode(col("array")))

output.show()
+---+---+
| id|col|
+---+---+
|  1|  1|
|  1|  2|
|  1|  3|
|  1|  4|
|  2|  1|
|  2|  2|
+---+---+
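As a possible variation on the snippet above, flatMap on a typed Dataset can emit the expanded rows directly, which skips the intermediate array column and the explode step. This is only a sketch: the object name FlatMapExpand, the expand helper, and the local[*] master setting are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// FlatMapExpand and expand are illustrative names, not Spark APIs.
object FlatMapExpand {
  // Turn each (id, size) pair into the rows (id, 1) ... (id, size).
  def expand(ds: Dataset[(Int, Int)]): DataFrame = {
    import ds.sparkSession.implicits._ // encoders for the tuple result
    ds.flatMap { case (id, size) => (1 to size).map(n => (id, n)) }
      .toDF("id", "size")
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for trying this outside spark-shell.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatmap-expand")
      .getOrCreate()
    import spark.implicits._

    val input = Seq((1, 4), (2, 2)).toDS()
    expand(input).show()

    spark.stop()
  }
}
```

Because the lambda runs as ordinary Scala code, Spark cannot optimize it the way it can the built-in sequence and explode functions, so the built-ins are usually the better choice when they are available.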
