To remove duplicate rows, I tried this SQL:
val characters = MongoSpark.load[sparkSQL.Character](sparkSession)
characters.createOrReplaceTempView("characters")
val testsql = sparkSession.sql("SELECT * FROM characters GROUP BY title")
testsql.show()
But this query produces the error message below. If you know what causes it, please let me know.
Thank you.
Parsing command: SELECT * FROM characters GROUP BY title
Exception in thread "main" org.apache.spark.sql.AnalysisException:
expression 'characters.`url`' is neither present in the group by, nor is it an aggregate function.
Add to group by or wrap in first() if you don't care which value you get.;;
I then tried the following, but I don't know whether it is the right solution.
Any help would be appreciated. Thank you!
val characters = MongoSpark.load[sparkSQL.Character](sparkSession)
characters.createOrReplaceTempView("characters")
val testsql = sparkSession.sql("SELECT * FROM characters")
val testgrsql = testsql.groupBy("title")
testgrsql.show()
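Every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function, which is why SELECT * fails here. If you are not using any aggregate functions, select only the column you group by:
val testsql = sparkSession.sql("SELECT title FROM characters GROUP BY title")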
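If you need the other columns as well, here is a rough sketch of two options: `dropDuplicates` on the DataFrame, or `first()` in SQL as the error message suggests. It uses a small in-memory dataset in place of the MongoDB load. The `CharacterRow` case class, the object name `DedupeByTitle`, the helper `dedupe` and the sample values are made up for illustration; only the `title` and `url` columns are known from the question. In both variants, which of the duplicate rows survives is not guaranteed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DedupeByTitle {
  // Hypothetical stand-in for sparkSQL.Character; the real class may have more fields.
  case class CharacterRow(title: String, url: String)

  // Keep one full row per title. Spark does not guarantee which duplicate is kept.
  def dedupe(df: DataFrame): DataFrame = df.dropDuplicates(Seq("title"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dedupe-by-title")
      .getOrCreate()
    import spark.implicits._

    // Sample data (assumption) instead of MongoSpark.load
    val characters = Seq(
      CharacterRow("a", "u1"),
      CharacterRow("a", "u2"),
      CharacterRow("b", "u3")
    ).toDF()
    characters.createOrReplaceTempView("characters")

    // Option 1: DataFrame API, one row per title with all columns kept
    dedupe(characters).show()

    // Option 2: SQL, wrapping every non-grouped column in first()
    spark.sql("SELECT title, first(url) AS url FROM characters GROUP BY title").show()

    spark.stop()
  }
}
```

Note that `groupBy("title")` on its own returns a grouped dataset that still needs an aggregation such as `agg` or `count` before it can be shown, so the second attempt in the question would need one of these forms instead.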