1

I have json data like

{
"labels1":
     {"A":1,"B":2, "C":3},
"labels2":
     {"A":1,"B":2, "C":3},
}

and I want 3 output columns that say tagname, keyname,value. The final output will be like

tagname,key,value
labels1,A,1
labels1,B,2
labels1,C,3
labels2,A,1
labels2,B,2
labels2,C,3

How can I achieve this usecase, also the keys A,B,C are just sample and there can be multiple optional fields. Thanks in advance and let me know if any more information is required.

1 Answer 1

0

Try with inbuilt pyspark functions for this case like stack and unnest the struct to add as new columns.

Example:

from pyspark.sql.functions import *
json = """{"labels1":{"A":1,"B":2, "C":3},"labels2":{"A":1,"B":2, "C":3}}"""
df = spark.read.json(sc.parallelize([json]), multiLine=True)
df.select(expr("stack(2,'labels1',labels1,'labels2',labels2)")).\
  select(col("col0").alias("tagname"),col("col1.*")).\
  select("tagname",expr("stack(3,'A',A,'B',B,'C',C) as (key,value)")).show()

#+-------+---+-----+
#|tagname|key|value|
#+-------+---+-----+
#|labels1|  A|    1|
#|labels1|  B|    2|
#|labels1|  C|    3|
#|labels2|  A|    1|
#|labels2|  B|    2|
#|labels2|  C|    3|
#+-------+---+-----+

Another way by using unpivot function:

df.withColumn("n",lit(1)).\
  unpivot("n",["labels1", "labels2"],"new","new1").select(col("new").alias("tagname"),col("new1.*")).\
  unpivot("tagname",["A","B","C"],"key","value").\
  show()
Sign up to request clarification or add additional context in comments.

4 Comments

Thank you so much for the response, but the issue is I have given only sample data. The keys A,B,C... can be optional and many more keys can be there sometimes. I have edited the question too now. Could you let me know how should I approach in that case.
you can create dynamic stack statement based on df.columns as explained here:stackoverflow.com/a/76314739/7632695
Thanks a lot :) it worked!! Also do you think if there is any other way possible.
AFAIK i think these are the possible ways, u may try with melt function from pyspark and i believe that is same as unpivot as mentioned in the answer!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.