I have a dataset containing data like the following:
| c1 | c2 |
-----------
| 1  | a  |
| 1  | b  |
| 1  | c  |
| 2  | a  |
| 2  | b  |
...
Now I want to group the data as follows (c1: String key, c2: List of the values for that key):
| c1 | c2      |
----------------
| 1  | a, b, c |
| 2  | a, b    |
...
I thought that groupByKey would be a sufficient solution, but I can't find any example of how to use it.
Can anyone help me find a solution that produces this output using groupByKey, or any other combination of transformations and actions, while working with Datasets rather than RDDs?
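
For reference, here is a minimal sketch of the kind of thing I have in mind, assuming Scala and a Spark 2.x or later `Dataset`. The `Record` case class, the `GroupToList` object and the `groupValues` helper are hypothetical names made up for illustration, and the local master is only there so the snippet can run on its own. `groupByKey` and `mapGroups` are the typed Dataset operations; whether this is the recommended approach is exactly what I am unsure about.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object GroupToList {
  // Hypothetical row type mirroring the two columns c1 and c2.
  final case class Record(c1: String, c2: String)

  // Groups rows by c1 and collects the c2 values of each group into a Seq.
  // The order of values inside a group is not guaranteed after a shuffle.
  def groupValues(spark: SparkSession, ds: Dataset[Record]): Dataset[(String, Seq[String])] = {
    import spark.implicits._ // encoders for String, tuples and Seq
    ds.groupByKey(_.c1) // key extractor: one group per distinct c1
      .mapGroups { (key, rows) =>
        // rows is an Iterator over the group's records; materialize it eagerly
        val values: Seq[String] = rows.map(_.c2).toList
        (key, values)
      }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("group-to-list")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(
      Record("1", "a"), Record("1", "b"), Record("1", "c"),
      Record("2", "a"), Record("2", "b")
    ).toDS()

    groupValues(spark, ds).toDF("c1", "c2").show()
    spark.stop()
  }
}
```

If a comma-separated string is wanted instead of a list, `rows.map(_.c2).mkString(",")` inside `mapGroups` should do that. An untyped alternative that stays off RDDs would presumably be `ds.groupBy("c1").agg(collect_list("c2"))` with `collect_list` from `org.apache.spark.sql.functions`, though that returns a `DataFrame` rather than a typed `Dataset`.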