
I have an RDD and I want to convert it to a pandas DataFrame. I know that to convert an RDD to a normal (Spark) DataFrame we can do

df = rdd1.toDF()

But I want to convert the RDD to a pandas DataFrame, not a Spark DataFrame. How can I do it?

2 Answers


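You can use the toPandas() method. It is defined on Spark DataFrames, so convert the RDD with toDF() first and then call toPandas() on the result. From the documentation: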

Returns the contents of this DataFrame as Pandas pandas.DataFrame.

This is only available if Pandas is installed and available.

>>> df.toPandas()  
   age   name
0    2  Alice
1    5    Bob
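
To tie this back to the RDD in the question, here is a minimal sketch of the two steps chained together. The helper names rdd_to_pandas and demo, the column names, and the local[1] master are assumptions made for illustration; they are not from the question.

```python
def rdd_to_pandas(rdd, columns=None):
    """Convert an RDD of tuples/Rows to a pandas DataFrame.

    `columns` is an optional list of column names (hypothetical helper,
    not part of PySpark). Needs an active SparkSession, because toDF()
    is only attached to RDDs once a session exists.
    """
    spark_df = rdd.toDF(columns)   # step 1: RDD -> Spark DataFrame
    return spark_df.toPandas()     # step 2: Spark DataFrame -> pandas (collects to the driver)


def demo():
    """Small local run; assumes pyspark and pandas are installed."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("rdd-to-pandas").getOrCreate()
    try:
        rdd1 = spark.sparkContext.parallelize([(2, "Alice"), (5, "Bob")])
        return rdd_to_pandas(rdd1, ["age", "name"])
    finally:
        spark.stop()
```

Keep in mind that toPandas() pulls every row to the driver, so this only makes sense when the data fits in the driver's memory.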

2 Comments

What is the difference between toDF() and toPandas()?
@jezrael, how to convert only first 10 rows of spark df to pandas df?
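
On the first-10-rows question, one possible approach is to cut the Spark DataFrame down with limit() before converting, so only those rows are collected. This is a sketch; head_to_pandas is a hypothetical helper name, and which rows come back is not guaranteed unless the DataFrame is ordered first.

```python
def head_to_pandas(spark_df, n=10):
    """Convert at most `n` rows of a Spark DataFrame to pandas.

    limit(n) is applied on the Spark side, so toPandas() only
    collects up to n rows to the driver.
    """
    return spark_df.limit(n).toPandas()
```

For example, head_to_pandas(df) returns a pandas DataFrame with at most 10 rows.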

You'll have to use a Spark DataFrame as an intermediary step between your RDD and the desired Pandas DataFrame.

For example, let's say I have a text file, flights.csv, that has been read into an RDD:

flights = sc.textFile('flights.csv')

You can check the type:

type(flights)
<class 'pyspark.rdd.RDD'>

Calling toPandas() directly on the RDD won't work, because the method is defined on Spark DataFrames and not on RDDs. Depending on the format of the objects in your RDD, some processing may be necessary to get to a Spark DataFrame first. In this example, the following code does the job:

# RDD to Spark DataFrame
sparkDF = flights.map(lambda x: str(x)).map(lambda w: w.split(',')).toDF()

# Spark DataFrame to Pandas DataFrame
pdsDF = sparkDF.toPandas()

You can check the type:

type(pdsDF)
<class 'pandas.core.frame.DataFrame'>
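
With the plain split(',') above, the columns get default names (_1, _2, ...) and any header line ends up as a data row. If your file has a header row (an assumption here, I'm not claiming anything about the layout of flights.csv), a variation along these lines may be more convenient. parse_line and csv_rdd_to_pandas are hypothetical helper names.

```python
import csv


def parse_line(line):
    """Split one CSV line into a list of string fields.

    Uses the csv module so that commas inside quoted fields are kept,
    which a plain line.split(',') would break apart.
    """
    return next(csv.reader([line]))


def csv_rdd_to_pandas(lines):
    """Convert an RDD of CSV text lines to a pandas DataFrame.

    Assumes the first line is a header row and that a SparkSession
    is active (toDF() is only available on RDDs once one exists).
    """
    header = lines.first()                                   # first line as a string
    rows = lines.filter(lambda l: l != header).map(parse_line)
    spark_df = rows.toDF(parse_line(header))                 # header fields become column names
    return spark_df.toPandas()
```

You would call it as csv_rdd_to_pandas(flights). All values are still strings at this point, just as in the split(',') version, so cast the columns afterwards if you need numeric types.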

3 Comments

I think pdsDF = sparkDF.toPandas is missing the () to actually call the method. It should be: pdsDF = sparkDF.toPandas()
What is the difference between toDF() and toPandas()?
toDF() converts an RDD to a Spark DataFrame, and toPandas() converts a Spark DataFrame to a Pandas DataFrame. The two kinds of DataFrames are different types.
