I have a SQL query that I want to convert to Spark Scala (DataFrame API):
SELECT aid,DId,BM,BY
FROM (SELECT DISTINCT aid,DId,BM,BY,TO FROM SU WHERE cd =2) t
GROUP BY aid,DId,BM,BY HAVING COUNT(*) >1;
SU is my DataFrame. Currently I run the query like this:
sqlContext.sql("""
SELECT aid,DId,BM,BY
FROM (SELECT DISTINCT aid,DId,BM,BY,TO FROM SU WHERE cd =2) t
GROUP BY aid,DId,BM,BY HAVING COUNT(*) >1
""")
Instead of that, I need to express it with DataFrame operations. My attempt so far:
val GP = SU.groupBy("aid","DId","BM","BY").agg(countDistinct("aid","DId","BM","BY","TO").alias("count") > 1 ).show
I had registered SU as a temp table, but I don't want to use a SQL query.
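For reference, here is a minimal sketch of how the SQL above might map onto DataFrame operations step by step: filter on cd, take the distinct rows over the five columns, group by the four key columns, and keep groups with more than one row. The sample rows and the local SparkSession are hypothetical stand-ins for the real SU (SparkSession assumes Spark 2.x or later; on older versions SU would come from the existing sqlContext instead).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session for illustration only.
val spark = SparkSession.builder().master("local[*]").appName("su-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the real SU DataFrame.
val SU = Seq(
  (1, 10, 1, 2015, 5, 2),
  (1, 10, 1, 2015, 6, 2),
  (1, 10, 1, 2015, 6, 2), // exact duplicate, removed by distinct()
  (2, 20, 3, 2016, 7, 2),
  (3, 30, 4, 2016, 8, 1)  // cd != 2, removed by the filter
).toDF("aid", "DId", "BM", "BY", "TO", "cd")

val GP = SU
  .filter(col("cd") === 2)                  // WHERE cd = 2
  .select("aid", "DId", "BM", "BY", "TO")
  .distinct()                               // SELECT DISTINCT ...
  .groupBy("aid", "DId", "BM", "BY")        // GROUP BY ...
  .count()                                  // adds a column named "count"
  .filter(col("count") > 1)                 // HAVING COUNT(*) > 1
  .select("aid", "DId", "BM", "BY")

GP.show()
```

Calling show() separately keeps GP a DataFrame; chaining .show onto the assignment would make GP a Unit, since show() only prints and returns nothing.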