
I'm writing a Spark app in Scala with the following data:

+----------+--------------------+
|        id|                data|
+----------+--------------------+
|    id1   |[AC ED 00 05 73 7...|
|    id2   |[CF 33 01 61 88 9...|
+----------+--------------------+

The schema shows:

root
 |-- id: string (nullable = true)
 |-- data: binary (nullable = true)

I want to convert this DataFrame into a Map, with id as the key and data as the value.

I have tried:

df.as[(String, BinaryType)].collect.toMap

but I got following error:

java.lang.UnsupportedOperationException: No Encoder found for org.apache.spark.sql.types.BinaryType
- field (class: "org.apache.spark.sql.types.BinaryType", name: "_2")
- root class: "scala.Tuple2"
Comment: Should be Array[Byte]. (Mar 11, 2020)

1 Answer


BinaryType is a Spark DataType. It maps in Scala/Java to Array[Byte].

Try df.as[(String, Array[Byte])].collect.toMap.

Make sure you've imported your session's implicits, e.g. import spark.implicits._, so that Encoder[T] instances can be resolved implicitly.
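For reference, a minimal end-to-end sketch (assuming a local SparkSession and a small made-up dataset; the column names id and data follow the question's schema):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("binary-to-map")
  .master("local[*]")      // assumption: running locally
  .getOrCreate()
import spark.implicits._   // needed so Encoder[(String, Array[Byte])] resolves

// Hypothetical sample rows matching the question's schema (id: string, data: binary)
val df = Seq(
  ("id1", Array[Byte](0xAC.toByte, 0xED.toByte, 0x00, 0x05)),
  ("id2", Array[Byte](0xCF.toByte, 0x33, 0x01, 0x61))
).toDF("id", "data")

// BinaryType is represented as Array[Byte] on the JVM, so this encoder exists
val byId: Map[String, Array[Byte]] = df.as[(String, Array[Byte])].collect().toMap

println(byId("id1").map(b => f"${b & 0xFF}%02X").mkString(" "))   // AC ED 00 05

Note that collect() pulls the whole dataset to the driver, so this only makes sense when the data comfortably fits in driver memory.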


3 Comments

I tried your suggestion; now the error is: result.as[(String, Array[Byte])].collect.toMap java.lang.IllegalArgumentException: Unsupported class file major version 57 at org.apache.xbean.asm6.ClassReader.<init>(ClassReader.java:166)
You are likely running the wrong Java version. See stackoverflow.com/questions/53583199/…
Source: spark.apache.org/docs/latest/api/java/org/apache/spark/sql/… (The data type representing Array[Byte] values. Please use the singleton DataTypes.BinaryType.)
