I have a dataframe like the one below:

+---+------------+----------------------------------------------------------------------+
|id |indexes     |arrayString                                                           |
+---+------------+----------------------------------------------------------------------+
|2  |1,3         |[WrappedArray(3, Str3), WrappedArray(1, Str1)]                        |
|1  |2,4,3       |[WrappedArray(2, Str2), WrappedArray(3, Str3), WrappedArray(4, Str4)] |
|0  |1,2,3       |[WrappedArray(1, Str1), WrappedArray(2, Str2), WrappedArray(3, Str3)] |
+---+------------+----------------------------------------------------------------------+

I want to loop through arrayString, treating the first element of each inner array as an index and the second element as a string, and then replace each value in indexes with the string that corresponds to that index in arrayString. I want output like this:

+---+---------------+
|id |replacedString |
+---+---------------+
|2  |Str1,Str3      |
|1  |Str2,Str4,Str3 |
|0  |Str1,Str2,Str3 |
+---+---------------+

I tried using the UDF below.

  val replaceIndex = udf((itemIndex: String, arrayString: Seq[Seq[String]]) => {
    val itemIndexArray = itemIndex.split("\\,")
    arrayString.map(i => {
      itemIndexArray.updated(i(0).toInt, i(1))
    })
    itemIndexArray
  })

This gives me an error and I am not getting my desired output. Is there another way to achieve this? I can't use explode and join, because I want the indexes replaced with strings without losing their order.

1 Answer

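You can create a UDF like the one below to get the required result: convert the array of arrays to a map, then look up each index as a key in that map. Note that in your attempt, updated returns a new array instead of modifying itemIndexArray, and it treats i(0).toInt as a position in the array rather than as a value to match, so the original array is returned unchanged or an out-of-bounds error is thrown.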

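import org.apache.spark.sql.functions.udf
// Assumes spark.implicits._ is in scope for the $"..." column syntax.

val replaceIndex = udf((itemIndex: String, arrayString: Seq[Seq[String]]) => {
  // Split "1,3" into Array("1", "3")
  val indexList = itemIndex.split("\\,")
  // Turn each (index, string) pair into a map entry: index -> string
  val array = arrayString.map(x => (x(0) -> x(1))).toMap
  // Look up each index in its original order and join the results
  indexList map array mkString ","
})

dataframe.withColumn("arrayString", replaceIndex($"indexes", $"arrayString"))
  .show(false)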

Output:

+---+-------+--------------+
|id |indexes|arrayString   |
+---+-------+--------------+
|2  |1,3    |Str1,Str3     |
|1  |2,4,3  |Str2,Str4,Str3|
|0  |1,2,3  |Str1,Str2,Str3|
+---+-------+--------------+
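If it helps, the lookup itself can be tried in plain Scala without Spark. This is only a sketch: replaceIndexes is a hypothetical helper name, and keeping an unmatched index as-is (instead of throwing, as the map lookup above would) is an assumption about what you might want for incomplete data.

```scala
// Plain-Scala sketch of the lookup done inside the UDF (no Spark needed).
// `replaceIndexes` is a hypothetical helper name, not part of any library.
def replaceIndexes(indexes: String, pairs: Seq[Seq[String]]): String = {
  // Build an index -> string lookup from the (index, string) pairs.
  val lookup: Map[String, String] = pairs.map(p => p(0) -> p(1)).toMap
  // Assumption: an index with no matching pair is kept unchanged
  // rather than throwing NoSuchElementException.
  indexes.split(",").map(i => lookup.getOrElse(i, i)).mkString(",")
}

// Sample data taken from the row with id 1 in the question.
val row1 = Seq(Seq("2", "Str2"), Seq("3", "Str3"), Seq("4", "Str4"))
val replaced = replaceIndexes("2,4,3", row1) // "Str2,Str4,Str3"
```

The order of the result follows the order of the indexes string, not the order of the pairs, which is why no explode and join is needed.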

Hope this helps!


3 Comments

Yes, this answer helps me. Can you please explain what is happening in this line: indexList map array mkString ","
That is equivalent to indexList.map(array).mkString(",") and to indexList.map(i => array(i)).mkString(","). I think the expanded form is easier to understand. (Note that array.get(i) would return an Option, so it is not the same thing.)
Yes. Thanks for the answer and explanation.
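For anyone checking the forms discussed in these comments, here is a small plain-Scala sketch (the sample map and index string are made up for illustration). It relies on a Map being usable as a function from key to value, and it shows why array.get(i) is not a drop-in replacement: get returns an Option.

```scala
// Sample data for illustration only.
val array = Map("1" -> "Str1", "3" -> "Str3")
val indexList = "1,3".split(",")

// Infix notation, as written in the answer.
val infix = indexList map array mkString ","
// The same call chain in dot notation; the Map is passed as a function.
val dotted = indexList.map(array).mkString(",")
// Fully spelled out: array(i) looks up key i and throws if it is missing.
val explicit = indexList.map(i => array(i)).mkString(",")
// array.get(i) returns Option[String], so the joined text differs.
val viaGet = indexList.map(i => array.get(i)).mkString(",")
```

The first three values are all "Str1,Str3", while viaGet is "Some(Str1),Some(Str3)".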
