I need to reduce each column of a DataFrame to a single value by summing all of that column's values. The columns themselves stay intact; each one just ends up holding its sum. For this I intend to use the following function:
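from pyspark.sql import functions as f  # assumption: f is pyspark.sql.functions

def sum_col(data, col):
    # Sum one column and return the result as a plain Python value
    return data.select(f.sum(col)).collect()[0][0]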
I was thinking of doing something like this:
data = data.map(lambda current_col: sum_col(data, current_col))
Is this doable, or do I need another way to sum the values of each column?