I have a Spark SQL question: I'd appreciate some guidance on the best way to do a conditional select from a nested array of structs.

I have an example JSON document below:

```

{
   "id":"p1",
   "externalIds":[
      {"system":"a","id":"1"},
      {"system":"b","id":"2"},
      {"system":"c","id":"3"}
    ]
}

```

In spark SQL I want to select the "id" of one of the array structs based on some conditional logic.

E.g., for the document above, select the id field of the array element that has "system" = "b", namely the id "2".

How best to do this in SparkSQL?

Cheers and thanks!

  • Unless you explode, the only option is a UDF, which depends on the language you use. Commented May 3, 2017 at 10:17
  • Thanks, I'm using Scala. I don't want to explode, no. Commented May 3, 2017 at 11:19
  • With Scala you can also convert to a statically typed Dataset. Or, as mentioned above, use a UDF. If you knew the index, you could also use it, but I assume you don't. Commented May 3, 2017 at 11:43
  • Have you considered accepting my answer? Commented May 17, 2017 at 6:31

1 Answer

With a UDF, this could look like the following, given a DataFrame (all attributes of type String):

+---+---------------------+
|id |externalIds          |
+---+---------------------+
|p1 |[[a,1], [b,2], [c,3]]|
+---+---------------------+

Define a UDF to traverse your array and find the desired element:

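import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf

// Builds a UDF that returns the id of the first struct whose system matches,
// or null when no element matches (the Option result maps None to null).
def getExternal(system: String) = {
  udf((rows: Seq[Row]) =>
    rows.map(r => (r.getString(0), r.getString(1))) // (system, id) pairs
      .find { case (s, _) => s == system }
      .map(_._2)
  )
}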

and use it like this:

// The $"..." column syntax requires `import spark.implicits._`
df
  .withColumn("external", getExternal("b")($"externalIds"))
  .show(false)

+---+---------------------+--------+
|id |externalIds          |external|
+---+---------------------+--------+
|p1 |[[a,1], [b,2], [c,3]]|2       |
+---+---------------------+--------+
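
If you are on Spark 2.4 or later, the SQL higher-order function `filter` may let you avoid both the UDF and the explode. The following is only a sketch under that assumption: the `ExternalId` case class, the `FilterSketch` object and the `externalB` helper are hypothetical names made up for this example, and the sample data simply mirrors the document in the question.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.expr

// Hypothetical case class mirroring one element of the externalIds array
case class ExternalId(system: String, id: String)

object FilterSketch {
  // Keep only the structs whose system is 'b', take the first match, read its id.
  // Assumes Spark 2.4+, where `filter` with a lambda is available in SQL expressions.
  def externalB: Column = expr("filter(externalIds, x -> x.system = 'b')[0].id")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same shape as the question's document: an id plus an array of (system, id) structs
    val df = Seq(
      ("p1", Seq(ExternalId("a", "1"), ExternalId("b", "2"), ExternalId("c", "3")))
    ).toDF("id", "externalIds")

    df.withColumn("external", externalB).show(false)

    spark.stop()
  }
}
```

For the sample row, the `external` column should hold "2", as with the UDF. When no element matches, indexing the empty result with `[0]` yields null under non-ANSI settings; with ANSI mode enabled an out-of-range index can raise an error instead, so check the behaviour on your Spark version.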