I have a col in a dataframe which is an array of structs. There are some structs with all null values which I would like to filter out. For example with the following dataframe:
+—————+————————————————---------------------—+
|advertiser |brands |
+—————+—————————————---------------------————+
Advertiser1 [{"id" : "a", "type" : "b", "name" : "c"}]
Advertiser2 [{"id" : null, "type" : null, "name" : null}]
+—————+————————————---------------------—————+
I would like to filter out the struct with the null values to get:
+—————+———————————————---------------------——+
|advertiser |brands |
+—————+————————————---------------------—————+
Advertiser1 [{"id" : "a", "type" : "b", "name" : "c"}]
Advertiser2 []
+—————+—————————————---------------------————+
I'm thinking it's something along the lines of this if I can come up with a struct of null values:
.withColumn(
"brands",
when(
col("brands").equalTo(*emptyStruct?*),
null
)
)