My understanding of Spark window functions is as follows:
current row (1) -> window rows (1 or more) -> aggregation func. -> output for the current row (1)
where a single row can be included in multiple windows. The aggregation function f is invoked as f.over(window), which ties each window to exactly one function. For example, I cannot apply filter() (especially not a dynamic one) to only the window rows before aggregating with sum().over(window).
To do custom processing of the window rows, I can:
a) write a UDF that receives the window rows as input
b) use collect_list() to gather the window rows into a list for each row and continue processing these lists
Is there any other way to apply multiple standard Spark functions to the same window rows?