There are probably at least 10 questions very similar to this one, but I still have not found a clear answer.

How can I add a nullable string column to a DataFrame using Scala? I was able to add a column with null values, but its data type shows as null:

val testDF = myDF.withColumn("newcolumn", when(col("UID") =!= "not", null).otherwise(null))

However, the schema shows

root
 |-- UID: string (nullable = true)
 |-- IsPartnerInd: string (nullable = true)
 |-- newcolumn: null (nullable = true)

I want the new column to be a string:

 |-- newcolumn: string (nullable = true)

Please don't mark this as a duplicate unless it's really the same question and in Scala.

    Try myDF.withColumn("newcolumn", lit(null).cast("string")). Commented Oct 17, 2019 at 23:06

2 Answers

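Explicitly cast the null literal to StringType (from org.apache.spark.sql.types). An untyped null literal is inferred as NullType, which is why the schema shows null.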

scala> val testDF = myDF.withColumn("newcolumn", when(col("UID") =!= "not", lit(null).cast(StringType)).otherwise(lit(null).cast(StringType)))

scala> testDF.printSchema

root
 |-- UID: string (nullable = true)
 |-- newcolumn: string (nullable = true)
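
For reference, here is a self-contained sketch of the same idea that can be pasted into spark-shell or run as a Scala script with Spark on the classpath. The local SparkSession, the app name, and the sample rows are assumptions made up for illustration; only the column names come from the question. It contrasts the untyped null from the question with the cast version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}
import org.apache.spark.sql.types.{NullType, StringType}

// Assumption: a throwaway local session; in spark-shell the existing one is reused.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("nullable-string-column")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with the column names from the question.
val myDF = Seq(("a1", "Y"), ("not", "N")).toDF("UID", "IsPartnerInd")

// The original attempt: both branches are untyped nulls, so the column is NullType.
val untypedDF = myDF.withColumn("newcolumn", when(col("UID") =!= "not", null).otherwise(null))

// Casting the null literal gives the column a concrete type.
val typedDF = myDF.withColumn("newcolumn", lit(null).cast(StringType))

println(untypedDF.schema("newcolumn").dataType) // NullType
println(typedDF.schema("newcolumn").dataType)   // StringType
typedDF.printSchema()
```

Since every row gets null either way, the when/otherwise wrapper is not needed once the literal is cast; a single lit(null).cast(StringType) produces the same schema.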

Why do you want a column which is always null? There are several ways to do this; I would prefer the solution with typedLit:

myDF.withColumn("newcolumn", typedLit[String](null))

or for older Spark versions:

myDF.withColumn("newcolumn", lit(null).cast(StringType))
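
A minimal sketch comparing the two variants side by side, assuming a Spark version that provides functions.typedLit (2.2 or later, as far as I know). The local session and the sample rows are hypothetical; only the column names are taken from the question. To my understanding, typedLit takes the column type from the Scala type parameter, so no cast is needed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{lit, typedLit}
import org.apache.spark.sql.types.StringType

// Assumption: a throwaway local session; in spark-shell the existing one is reused.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("typedlit-null-column")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with the column names from the question.
val myDF = Seq(("a1", "Y"), ("not", "N")).toDF("UID", "IsPartnerInd")

// Variant 1: the type comes from the type parameter of typedLit.
val viaTypedLit = myDF.withColumn("newcolumn", typedLit[String](null))

// Variant 2: an untyped null literal, cast explicitly.
val viaCast = myDF.withColumn("newcolumn", lit(null).cast(StringType))

// Both variants should report the same schema for the new column.
println(viaTypedLit.schema("newcolumn"))
println(viaCast.schema("newcolumn"))
```

The cast also accepts the type name as a string, as in the comment on the question: lit(null).cast("string").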
