I have a PySpark DataFrame like the first one below.
Without converting it to a pandas DataFrame, I need to change it into the second form, where each row holds a map of column names to values.
Does anyone have an idea how to do this?

    a   b
A   1   2
B   4   2

↓
    value
A   {a : 1, b : 2}
B   {a : 4, b : 2}

  • In plain pandas you'd use df.to_dict() Commented Nov 9, 2021 at 4:31
  • this is in pyspark dataframe! @smci Commented Nov 9, 2021 at 4:40
  • I know, I'm suggesting you find the equivalent. Even knowing the pandas equivalent should help you. Commented Nov 9, 2021 at 4:50
  • ...actually df.to_json() Commented Nov 10, 2021 at 0:45

1 Answer

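You can pack the columns into a struct and then serialize it to a JSON string with the to_json function. Note that the resulting column is a string containing JSON, not a MapType column.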

Working Example

import pyspark.sql.functions as F

# Assumes an active SparkSession named `spark`.
df = spark.createDataFrame([{"a": 1, "b": 2}, {"a": 4, "b": 2}])

# Pack every column into one struct, then serialize the struct to a JSON string.
df.withColumn("value", F.to_json(F.struct(*df.columns))).select("value").show()

Output

+-------------+
|        value|
+-------------+
|{"a":1,"b":2}|
|{"a":4,"b":2}|
+-------------+