I have the script below (I've removed all the column names etc. to make it easier to see what I am doing at a high level; it was very messy!).
I need to add a column that is the equivalent of count(*) in SQL.
So if I have grouped user usage by domain, I might see the output below, where the count is the number of records that match all the prior column conditions.
domain.co.uk/        UK    User    32433
domain.co.uk/home    EU    User    43464
etc...
I'm sure it's been asked somewhere on Stack Overflow before, but I've had a good look around and can't find any reference to it!
    vpx_cont_filter = vpx_data\
        .coalesce(1000)\
        .join(....)\
        .select(....)\
        .groupBy(....)\
        .agg(
            ....
        )\
        .select(....)