I have a list of integers and a Spark DataFrame (created through a sqlContext) whose number of rows equals the length of the list. I want to add the list as a column to this DataFrame while maintaining the order. I feel like this should be really simple, but I can't find an elegant solution.

1 Answer

You cannot simply add a list as a DataFrame column, because a list is a local object and a DataFrame is distributed. You can try one of the following approaches:

  • convert the DataFrame to a local collection with collect() or toLocalIterator() and, for each row, add the corresponding value from the list, OR
  • convert the list to a DataFrame with an extra key column (using keys that match the original DataFrame) and then join the two

1 Comment

I ended up doing the second because collect or toLocalIterator would have overwhelmed the memory. The trouble was that it took me a while to figure out how to do the second point, which is partly why I asked the question. I didn't ask this explicitly because I was hoping there was a more elegant way.
