I'm using Spark Datasets to read in CSV files. I wanted to make a polymorphic function to do this for a number of files. Here's the function:

def loadFile[M](file: String): Dataset[M] = {
  import spark.implicits._
  val schema = Encoders.product[M].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}

The errors that I get are:

[error] <myfile>.scala:45: type arguments [M] do not conform to method product's type parameter bounds [T <: Product]
[error]     val schema = Encoders.product[M].schema
[error]                                  ^
[error] <myfile>.scala:50: Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]       .as[M]
[error]          ^
[error] two errors found

I don't know what to do about the first error. I tried adding the same upper bound as in the product definition (M <: Product), but then I get the error "No TypeTag available for M".
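
For reference, a sketch of that attempt with the TypeTag context bound the compiler is asking for (Encoders.product needs both the Product bound and a TypeTag; this assumes a SparkSession named spark in scope, as in the original snippet):

import scala.reflect.runtime.universe.TypeTag
import org.apache.spark.sql.{Dataset, Encoders}

def loadFile[M <: Product : TypeTag](file: String): Dataset[M] = {
  import spark.implicits._ // derives an Encoder for Product types that have a TypeTag
  val schema = Encoders.product[M].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}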

If I pass in the schema already produced from the encoder, I then get the error:

[error] Unable to find encoder for type stored in a Dataset 

1 Answer

You need to require that anyone calling loadFile[M] provide evidence that such an encoder exists for M. You can do this with a context bound on M, which requires an implicit Encoder[M]:

import org.apache.spark.sql.{Dataset, Encoder}

def loadFile[M : Encoder](file: String): Dataset[M] = {
  import spark.implicits._
  // The context bound supplies the Encoder[M]; summon it for its schema.
  val schema = implicitly[Encoder[M]].schema
  spark.read
    .option("header", "false")
    .schema(schema)
    .csv(file)
    .as[M]
}
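
The context bound M : Encoder desugars to an implicit parameter (implicit ev: Encoder[M]), and importing spark.implicits._ at the call site derives such an encoder for any case class. A minimal end-to-end sketch of the call site; the Person case class and people.csv path are hypothetical:

import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

object LoadFileExample {
  // Hypothetical case class matching the CSV columns; adjust to your data.
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("loadFile example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // derives Encoder[Person] for the call below

    def loadFile[M : Encoder](file: String): Dataset[M] = {
      val schema = implicitly[Encoder[M]].schema
      spark.read
        .option("header", "false")
        .schema(schema)
        .csv(file)
        .as[M]
    }

    val people: Dataset[Person] = loadFile[Person]("people.csv")
    people.show()
    spark.stop()
  }
}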

3 Comments

Thanks! That definitely compiled, but I had some access problems and an out-of-memory problem running my program, even when I don't call the function. I assume I can make my case class extend Encoder and it should work, if I didn't have these other runtime problems?
@kim This is a compile-time requirement; it shouldn't affect the runtime at all. Perhaps something else is causing your code to OOM.
I decided to get around the whole Encoder problem by not using Spark, but I did find this issue, which talks about encoders for custom objects. I'll come back to figuring it out when I have some time. I'll mark this as my answer though since it got me on the right track.
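
The issue mentioned in that last comment concerns encoders for types that are not case classes. For reference, one standard fallback in that situation is a Kryo-based encoder, sketched here with a hypothetical Legacy class:

import org.apache.spark.sql.{Encoder, Encoders}

object CustomEncoders {
  // Hypothetical class that is not a case class, so no Encoder can be derived for it.
  class Legacy(val id: Int, val payload: String) extends Serializable

  // Encoders.kryo serializes the whole object into a single binary column,
  // trading per-field columns and Catalyst optimizations for generality.
  implicit val legacyEncoder: Encoder[Legacy] = Encoders.kryo[Legacy]
}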
