Following up on this question about dataframes, I am trying to convert a dataframe into a dictionary. In pandas I was using this:
dictionary = df_2.unstack().to_dict(orient='index')
However, I need to convert this code to pyspark. Can anyone help me with this? As I understand from previous questions such as this one, I would indeed need to use pandas, but the dataframe is way too big for me to collect it. How can I solve this?
EDIT:
I have now tried the following approach:
dictionary_list = map(lambda row: row.asDict(), df_2.collect())
dictionary = {age['age']: age for age in dictionary_list}
(reference) but it is not yielding the expected result.
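For context, here is a minimal, self-contained sketch of what the collect()-based attempt above does, using plain dicts in place of pyspark Row objects (the data and the 'age'/'count' columns are hypothetical, just to show the shape of the output):

```python
# Hypothetical rows, standing in for the dicts that row.asDict()
# would produce from df_2.collect() in pyspark.
rows = [
    {"age": 25, "count": 3},
    {"age": 30, "count": 7},
]

# Equivalent of: map(lambda row: row.asDict(), df_2.collect())
dictionary_list = list(map(lambda row: dict(row), rows))

# Key the outer dict by the 'age' field, as in the comprehension above.
dictionary = {row["age"]: row for row in dictionary_list}
print(dictionary)
# {25: {'age': 25, 'count': 3}, 30: {'age': 30, 'count': 7}}
```

Note that this keeps only one entry per distinct 'age' value, since later rows overwrite earlier ones with the same key.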
In pandas, what I was obtaining was the following:
