Using Spark dataframe , I need to compute the percentage by using the below complex formula :
Group by "KEY " and calculate "re_pct" as ( sum(sa) / sum( sa / (pct/100) ) ) * 100
For Instance , Input Dataframe is
val values1 = List(List("01", "20000", "45.30"), List("01", "30000", "45.30"))
.map(row => (row(0), row(1), row(2)))
val DS1 = values1.toDF("KEY", "SA", "PCT")
DS1.show()
+---+-----+-----+
|KEY| SA| PCT|
+---+-----+-----+
| 01|20000|45.30|
| 01|30000|45.30|
+---+-----+-----+
Expected Result :
+---+-----+--------------+
|KEY| re_pcnt |
+---+-----+--------------+
| 01| 45.30000038505 |
+---+-----+--------------+
I have tried to calculate as below
val result = DS1.groupBy("KEY").agg(((sum("SA").divide(
sum(
("SA").divide(
("PCT").divide(100)
)
)
)) * 100).as("re_pcnt"))
But facing Error:(36, 16) value divide is not a member of String ("SA").divide({
Any suggestion on implementing the above logic ?