I am looking for help with PySpark: I want to add a new column that holds, for each row, the number of rows sharing that row's value.
I have a list of distinct values stored in the variable unique_ids:
[Row(card_id=1), Row(card_id=2)]
For each value in the list, I want to count the number of rows whose card_id column matches that value, and then add a new column containing that count.
This is how I am getting the list:
unique_ids = data.select('card_id').distinct().collect()
Example df:
| card_id |
|---|
| 1 |
| 1 |
| 2 |
| 1 |
| 2 |
| 1 |
Required dataframe:
| card_id | Count |
|---|---|
| 1 | 4 |
| 1 | 4 |
| 2 | 2 |
| 1 | 4 |
| 2 | 2 |
| 1 | 4 |
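To make the requirement concrete, here is a minimal sketch of the logic. The helper names add_count_plain and add_count_spark are hypothetical, chosen only for illustration. The first function reproduces the required table in plain Python; the second shows one possible PySpark approach using a window count partitioned by card_id, assuming data is the DataFrame from the example. With a window count the collected unique_ids list may not be needed at all. Spark does not guarantee that the original row order is kept after a window operation.

```python
from collections import Counter


def add_count_plain(card_ids):
    """Plain-Python model of the requirement: pair each value with
    the number of times it occurs in the whole list."""
    counts = Counter(card_ids)
    return [(cid, counts[cid]) for cid in card_ids]


def add_count_spark(data):
    """Possible PySpark version (assumes `data` has a card_id column).
    PySpark is imported here so the plain helper above runs without it."""
    from pyspark.sql import Window, functions as F

    # All rows with the same card_id fall into the same window partition,
    # so counting over the window gives the per-value row count on every row.
    w = Window.partitionBy("card_id")
    return data.withColumn("Count", F.count("*").over(w))


if __name__ == "__main__":
    # Same values as the example df above.
    for card_id, count in add_count_plain([1, 1, 2, 1, 2, 1]):
        print(card_id, count)
```

If this matches the intent, the call would be data = add_count_spark(data), and no loop over unique_ids is required.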
Thanks