I have a sparse vector column obtained through OneHotEncoder in a spark dataframe, basically looking like this showing the first 10 rows:
+------------------------------------+
|check_indexed_encoded |
+------------------------------------+
| (3,[2],[1.0])|
| (3,[0],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[2],[1.0])|
| (3,[0],[1.0])|
+------------------------------------+
only showing top 10 rows
I am trying to access these elements to basically convert it back into (normally) hot encoded dummies to be able to convert the entire frame without issues into Pandas. Within spark I tried using .GetItem and .element but this throws also an error message "Can't extract value: need struct type". Any ideas how to get the values from that? Thanks!