
I tried the code below, but it's not working:

df=df.withColumn("cars", typedLit(Map.empty[String, String]))

Gives the error: NameError: name 'typedLit' is not defined

  • Was that imported before using it? Commented Jun 23, 2022 at 13:34
  • @samkart no, I am unable to figure out what to import for this Commented Jun 23, 2022 at 13:37
  • I just saw the tags -- pyspark does not have typedLit, but a similar result can be achieved using array and lit as described here Commented Jun 23, 2022 at 13:38

2 Answers


Create an empty column and cast it to the type you need.

from pyspark.sql import functions as F, types as T

df = df.withColumn("cars", F.lit(None).cast(T.MapType(T.StringType(), T.StringType())))
df.select("cars").printSchema()
root
 |-- cars: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)



Perhaps you can use pyspark.sql.functions.expr:

>>> from pyspark.sql.functions import *
>>> df.withColumn("cars",expr("map()")).printSchema()                                                                                                       
root
 |-- col1: string (nullable = true)
 |-- cars: map (nullable = false)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = false)

EDIT:

If you'd like your map to have keys and/or values of a non-trivial type (not map<string,string> as your question's title says), some casting becomes unavoidable, I'm afraid. For example:

>>> df.withColumn("cars",create_map(lit(None).cast(IntegerType()),lit(None).cast(DoubleType()))).printSchema()                                      
root
 |-- col1: string (nullable = true)
 |-- cars: map (nullable = false)
 |    |-- key: integer
 |    |-- value: double (valueContainsNull = true)

...in addition to other options suggested by @blackbishop and @Steven. And just beware of the consequences :) -- maps can't have null keys!

Comments

Thank you! One more question: what if I want to create map<int,int>?
@RahulDiggi use my solution for that!
Or expr("cast(map() as map<int,int>)").
@mazaneicha It should :) Note the difference: cast(map() as map<int,int>) creates an empty map whereas in the other solution it creates a NULL value of type map (it's equivalent to cast(null as map<int,int>)). Also, create_map function can't be used in this particular case as you can't pass null for keys.
@mazaneicha which version of spark are you using? I can execute the same code with spark 3.2, it works just fine.
