I have a DataFrame whose schema contains an array of structs, each holding a name and a map of properties:

root
 |-- array_of_properties: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- name: string (nullable = true)
 |    |    |-- props: map (nullable = true)
 |    |    |    |-- key: string
 |    |    |    |-- value: string (valueContainsNull = true)

I need to filter on the struct's name and on the values of specific keys in the props map inside the array. I can already filter on the name:

dataframe.filter(array_contains(col("array_of_properties.name"), "somename"))

How do I add AND conditions on the values of two keys in the nested props map (for example, a key named is_enabled with a value of true or false, and a key named source with the string value test)?

1 Answer

Use the exists function:

dataframe.filter("exists(array_of_properties, x -> x.name = 'somename' and x.props['is_enabled'] is true)")
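
The question asks about two keys, so the same predicate can be extended with a second condition on source. Below is a minimal sketch of how that expression might be assembled. The helper build_exists_filter is hypothetical (it is not part of Spark), and it assumes the map values are stored as strings, as the schema in the question shows, so is_enabled is compared against the string 'true' rather than a boolean.

```python
def build_exists_filter(array_col, name, props):
    """Build a Spark SQL `exists` predicate string (hypothetical helper, not a Spark API)."""

    def quote(value):
        # Refuse characters that would need escaping inside a SQL string literal.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in literal: {value!r}")
        return f"'{value}'"

    # Every condition must hold for the same array element x.
    conditions = [f"x.name = {quote(name)}"]
    # Assumption: map values are strings, so compare against string literals.
    for key, value in props.items():
        conditions.append(f"x.props[{quote(key)}] = {quote(value)}")
    return f"exists({array_col}, x -> {' and '.join(conditions)})"


predicate = build_exists_filter(
    "array_of_properties",
    "somename",
    {"is_enabled": "true", "source": "test"},
)
print(predicate)
# Usage with the question's DataFrame: dataframe.filter(predicate)
```

Because all conditions sit inside one lambda, they are checked against the same array element. Chaining separate array_contains filters would instead match rows where the name and the property values come from different elements.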

1 Comment

That works, thank you. Are there any drawbacks to using an exists filter?
