I have a Play web app using Scala 2.11.8 with Spark ("spark-core" % "2.2.0" and "spark-sql" % "2.2.0"). I am trying to read a file that contains movie ratings and apply some transformations to it. When I split each line on tabs (movieLines.map(x => (x.split("\t")(1).toInt, 1))), I get an error that I suspect is caused by a Guava library dependency conflict. I suspect this because every search result I find on Google suggests a fix based on Guava, but I cannot figure out how to exclude the Guava dependencies.

Here is my code:

def popularMovies() = Action { implicit request: Request[AnyContent] =>
    Util.downloadSourceFile("downloads/ml-100k.zip", "http://files.grouplens.org/datasets/movielens/ml-100k.zip")
    Util.unzip("downloads/ml-100k.zip")

    val sparkContext = SparkCommons.sparkSession.sparkContext
    println("got sparkContext")

    val movieLines = sparkContext.textFile("downloads/ml-100k/u.data")
    println("popularMovies")
    println(movieLines)

    // Map to (movieID , 1) tuples
    val movieTuples = movieLines.map(x => (x.split("\t")(1).toInt, 1))
    println("movieTuples")
    println(movieTuples)

    // Count up all the 1's for each movie
    val movieCounts = movieTuples.reduceByKey((x, y) => x + y)
    println("movieCounts")
    println(movieCounts)

    // Flip (movieId, count) to (count, movieId)
    val movieCountFlipped = movieCounts.map(x => (x._2, x._1))
    println(movieCountFlipped)

    // Sort
    val sortedMovies = movieCountFlipped.sortByKey()
    println(sortedMovies)

    // collect and print the result
    val results = sortedMovies.collect().toList.mkString(",\n")
    println(results)

    Ok("[" + results + "]")
}

and the error:

[error] application - 

! @76oh9h40m - Internal server error, for (GET) [/api/popularMovies] ->

play.api.http.HttpErrorHandlerExceptions$$anon$1: Execution exception[[RuntimeException: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat]]
    at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:255)
    at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:180)
    at play.core.server.AkkaHttpServer$$anonfun$3.applyOrElse(AkkaHttpServer.scala:311)
    at play.core.server.AkkaHttpServer$$anonfun$3.applyOrElse(AkkaHttpServer.scala:309)
    at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
    at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
    at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
Caused by: java.lang.RuntimeException: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
    at play.api.mvc.ActionBuilder$$anon$2.apply(Action.scala:424)
    at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:96)
    at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:89)
    at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
    at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
    at scala.util.Try$.apply(Try.scala:192)
    at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:174)
    at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:170)
    at scala.Function1$$anonfun$andThen$1.apply(Function1.scala:52)
    at scala.Function1$$anonfun$andThen$1.apply(Function1.scala:52)
Caused by: java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:312)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:194)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
    at scala.Option.getOrElse(Option.scala:121)
  • I don't know what is causing the issue, but I can tell you how to exclude a dependency from a library. In build.sbt, declare the dependency in the following format: "com.datastax.cassandra" % "cassandra-driver-core" % "3.3.2" exclude("io.netty", "*"). Here I'm excluding everything under io.netty from the cassandra-driver-core library. Commented Feb 3, 2018 at 3:28
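
The exclusion syntax from the comment above, applied to Guava, might look like the following sketch. Choosing spark-core and spark-sql as the libraries to exclude from is an assumption for illustration only: the thread does not confirm which dependency brings in the conflicting Guava version, and removing Guava entirely may break a library that needs it. The `%%` operator is also an assumption, based on the question's Scala 2.11.8.

```scala
// build.sbt (sketch; which modules to exclude from is an assumption)
libraryDependencies ++= Seq(
  // exclude(organization, name) drops one transitive dependency
  "org.apache.spark" %% "spark-core" % "2.2.0" exclude("com.google.guava", "guava"),
  // excludeAll accepts one or more ExclusionRule values
  "org.apache.spark" %% "spark-sql" % "2.2.0" excludeAll(
    ExclusionRule(organization = "com.google.guava")
  )
)
```

Running `sbt evicted` afterwards is one way to check which versions sbt selected when several were requested.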

1 Answer

Adding this dependency to build.sbt fixed my issue:

libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.7.2"
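
For context, a build.sbt that combines the line above with the Spark versions from the question might look like the sketch below. The `scalaVersion` setting and the `%%` operator are assumptions based on the question's Scala 2.11.8, and the Play-specific settings are omitted. A plausible explanation, not confirmed in this thread, is that Spark 2.2.0 pulls in an older hadoop-client whose FileInputFormat calls a Guava Stopwatch constructor that newer Guava releases no longer expose, while Hadoop 2.7.x no longer relies on that constructor.

```scala
// build.sbt (sketch; Play plugin settings omitted)
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0",
  "org.apache.spark" %% "spark-sql" % "2.2.0",
  // Declared explicitly so that this Hadoop client version is requested
  "org.apache.hadoop" % "hadoop-client" % "2.7.2"
)
```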

2 Comments

Thank you SO much! I was running into the same issue as you and thought I would have to figure out how to shade the Guava deps for either Play or Spark. I was completely unable to figure out how to do so, because trying to deploy Play or Spark alone in an uber jar raises a bunch of issues, never mind both together! How did you find this solution? Do you know why it works?
To the best of my memory, Spark and Scala bring in the Guava library and the Hadoop library overrides it. I think so, but I don't remember very well.
