I'm trying to take a hardcoded String and turn it into a 1-row Spark DataFrame (with a single column of type StringType) such that:

String fizz = "buzz"

would result in a DataFrame whose .show() output looks like:

+-----+
| fizz|
+-----+
| buzz|
+-----+

My best attempt thus far has been:

val rawData = List("fizz")
val df = sqlContext.sparkContext.parallelize(Seq(rawData)).toDF()

df.show()

But I get the following exception at runtime:

java.lang.ClassCastException: org.apache.spark.sql.types.ArrayType cannot be cast to org.apache.spark.sql.types.StructType
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:413)
    at org.apache.spark.sql.SQLImplicits.rddToDataFrameHolder(SQLImplicits.scala:155)

Any ideas as to where I'm going awry? Also, how do I set "buzz" as the row value for the fizz column?


Update:

Trying:

sqlContext.sparkContext.parallelize(rawData).toDF()

I get a DF that looks like:

+----+
|  _1|
+----+
|buzz|
+----+

2 Answers

Try:

sqlContext.sparkContext.parallelize(rawData).toDF()

In Spark 2.0 you can:

import spark.implicits._

rawData.toDF

Optionally provide a sequence of names for toDF:

sqlContext.sparkContext.parallelize(rawData).toDF("fizz")
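Putting the pieces together, here is a minimal self-contained sketch for Spark 2.x. The local master setting, the app name, and the object name are assumptions made for illustration and are not from the question. The exception in the question most likely comes from `Seq(rawData)`: wrapping the list in another `Seq` yields a single element that is itself a list, so Spark infers an array type rather than a row of columns. Passing the list directly makes each string its own row.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this sketch.
object SingleStringToDF {
  def main(args: Array[String]): Unit = {
    // Assumption: a local Spark 2.x session; adjust master/appName for your cluster.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("single-string-df")
      .getOrCreate()

    // Brings the toDF conversion for local collections into scope.
    import spark.implicits._

    // The row value goes in the collection; the column name goes to toDF.
    val rawData = List("buzz")

    // Each list element becomes one row; "fizz" names the single column.
    val df = rawData.toDF("fizz")

    df.printSchema() // a single column named fizz, of type string
    df.show()

    spark.stop()
  }
}
```

Note that the string in the list is the cell value, while the column name is supplied separately through `toDF`, so `"buzz"` belongs in `rawData` and `"fizz"` in the `toDF` call.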

3 Comments

Thanks @LostInOverflow (+1) - I think I'm almost there, please see my update. I am getting a single-row DF, with the correct value in it ("buzz" string), but the column name is "_1"...thoughts?
A DataFrame is like a dataset in tabular format with named columns. In the first case, you created the DataFrame without specifying any column names, so Spark assigned the defaults "_1", "_2", and so on.
How would this work in Java? sparkContext.parallelize takes two additional parameters: numSlices and ClassTag. The 2nd isn't clear to me.

In Java, the following works:

List<String> textList = Collections.singletonList("yourString");
SQLContext sqlContext = new SQLContext(sparkContext);
Dataset<Row> data = sqlContext
      .createDataset(textList, Encoders.STRING())
      .withColumnRenamed("value", "text");
