
I am using Spark SQL to read and write Parquet files.

In some cases, though, I need to write the DataFrame as a plain text file instead of JSON or Parquet.

Is there any built-in method for this, or do I have to convert the DataFrame to an RDD and use saveAsTextFile()?

2 Answers


Using Databricks spark-csv you can save directly to a CSV file, and load from a CSV file afterwards, like this:

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
    .format("com.databricks.spark.csv")
    .option("inferSchema", "true")
    .option("header", "true")
    .load("cars.csv");

df.select("year", "model").write()
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .option("codec", "org.apache.hadoop.io.compress.GzipCodec")
    .save("newcars.csv");

2 Comments

Should it be df.select("year", "model").write.format instead of df.select("year", "model").write().format? Otherwise (in PySpark) you get TypeError: 'DataFrameWriter' object is not callable.
This is the official example provided for Spark 1.3. On Spark 1.4+ you should use df.select("year", "model").write.format as you suggested.
On Spark 2.0+ the CSV writer is built in; repartition(1) collapses the output to a single part file:

df.repartition(1).write.option("header", "true").csv("filename.csv")

Comments
