I have a DataFrame like this:
+-------+------------------------+
|    key|data                    |
+-------+------------------------+
|     61|[a -> b, c -> d, e -> f]|
|     71|[a -> 1, c -> d, e -> f]|
|     81|[c -> d, e -> f]        |
|     91|[x -> b, y -> d, e -> f]|
|     11|[a -> a, c -> b, e -> f]|
|     21|[a -> a, c -> x, e -> f]|
+-------+------------------------+
I want to keep only the rows whose data map contains the key 'a' and where the value for that key is 'a'. The desired output is:
+-------+------------------------+
|    key|data                    |
+-------+------------------------+
|     11|[a -> a, c -> b, e -> f]|
|     21|[a -> a, c -> x, e -> f]|
+-------+------------------------+
I tried casting the column to a map, but I get this error:
== SQL ==
Map
^^^
at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitPrimitiveDataType$1.apply(AstBuilder.scala:1673)
at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitPrimitiveDataType$1.apply(AstBuilder.scala:1651)
at org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:108)
at org.apache.spark.sql.catalyst.parser.AstBuilder.visitPrimitiveDataType(AstBuilder.scala:1651)
at org.apache.spark.sql.catalyst.parser.AstBuilder.visitPrimitiveDataType(AstBuilder.scala:49)
at org.apache.spark.sql.catalyst.parser.SqlBaseParser$PrimitiveDataTypeContext.accept(SqlBaseParser.java:13779)
at org.apache.spark.sql.catalyst.parser.AstBuilder.typedVisit(AstBuilder.scala:55)
at org.apache.spark.sql.catalyst.parser.AstBuilder.org$apache$spark$sql$catalyst$parser$AstBuilder$$visitSparkDataType(AstBuilder.scala:1645)
at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitSingleDataType$1.apply(AstBuilder.scala:90)
at org.apache.spark.sql.catalyst.parser.AstBuilder$$anonfun$visitSingleDataType$1.apply(AstBuilder.scala:90)
at org.apache.spark.sql.catalyst.parser.ParserUtils$.withOrigin(ParserUtils.scala:108)
at org.apache.spark.sql.catalyst.parser.AstBuilder.visitSingleDataType(AstBuilder.scala:89)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser$$anonfun$parseDataType$1.apply(ParseDriver.scala:40)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser$$anonfun$parseDataType$1.apply(ParseDriver.scala:39)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:98)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseDataType(ParseDriver.scala:39)
at org.apache.spark.sql.Column.cast(Column.scala:1017)
... 49 elided
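For context on the trace: it ends in Column.cast, and the "== SQL ==" section shows that the text being parsed as a type name was the bare word Map. A map type name normally has to spell out its key and value types. The sketch below shows two ways of naming a full map type, assuming string keys and string values (the actual types of data are not shown above); CastTypeSketch is just an illustrative name.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{MapType, StringType}

object CastTypeSketch {
  // Type given as a DDL-style string; key and value types are assumed to be strings.
  val viaDdlString: Column = col("data").cast("map<string,string>")

  // The same target type, built from DataType objects instead of a string.
  val viaDataType: Column = col("data").cast(MapType(StringType, StringType))
}
```

If data is already a map column, no cast should be needed before filtering on it.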
If I only wanted to filter on the column 'key', I could simply use df.filter(col("key") === 61). The problem is that the data column is a Map.
Is there anything like df.filter(col("data").toMap.contains("a") && col("data").toMap.get("a") === "a")?
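For reference, a minimal sketch of how such a filter might be written with Column.getItem, which looks up a key in a map column. It assumes data is a map<string,string> column and that looking up a missing key yields null (so those rows drop out of the comparison). MapFilterSketch, sampleData and keepWhereMapValue are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object MapFilterSketch {
  // Rebuilds the sample rows shown above; map<string,string> is an assumption.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (61, Map("a" -> "b", "c" -> "d", "e" -> "f")),
      (71, Map("a" -> "1", "c" -> "d", "e" -> "f")),
      (81, Map("c" -> "d", "e" -> "f")),
      (91, Map("x" -> "b", "y" -> "d", "e" -> "f")),
      (11, Map("a" -> "a", "c" -> "b", "e" -> "f")),
      (21, Map("a" -> "a", "c" -> "x", "e" -> "f"))
    ).toDF("key", "data")
  }

  // Keeps rows where data(k) equals v. A missing key gives null here,
  // the comparison is then null, and filter drops the row.
  def keepWhereMapValue(df: DataFrame, k: String, v: String): DataFrame =
    df.filter(col("data").getItem(k) === v)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("map-filter-sketch")
      .getOrCreate()
    keepWhereMapValue(sampleData(spark), "a", "a").show(false)
    spark.stop()
  }
}
```

If the presence of the key needs to be checked explicitly, array_contains(map_keys(col("data")), "a") from org.apache.spark.sql.functions could be combined with the comparison on Spark 2.3 or later.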