I'm new to Spark. I tried to parse the JSON file below with Spark SQL, but it didn't work. Can someone please help me resolve this?

Input JSON:

[{"num":"1234","Projections":[{"Transactions":[{"14:45":0,"15:00":0}]}]}]

Expected output:

1234 14:45 0
1234 15:00 0

I tried the code below, but it did not work:

val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("hdfs:/user/aswin/test.json").toDF();
val sql_output = sqlContext.sql("SELECT num, Projections.Transactions FROM df group by Projections.TotalTransactions ")
sql_output.collect.foreach(println)

Output:

[01532,WrappedArray(WrappedArray([0,0]))]
  • Can you share what you have already tried? We can help you find the bug in your code. Commented Feb 3, 2017 at 6:36
  • Added in the question. Commented Feb 3, 2017 at 6:51
  • Why are you collecting? Try sql_output.show() for the DataFrame. Commented Feb 3, 2017 at 6:55
  • Also, you select 2 columns, but expect 3 columns? Commented Feb 3, 2017 at 6:57
  • I tried show() and got the response below:
    +-----+--------------------+
    |  NUM|        Transactions|
    +-----+--------------------+
    |01532|[WrappedArray([0,...|
    +-----+--------------------+
    I have no idea how to select keys like 14:45 in a SQL query. Commented Feb 3, 2017 at 7:08

1 Answer

Spark infers your {"14:45":0,"15:00":0} object as a struct (one field per key) rather than a map, so probably the only way to read the keys as data is to specify the schema manually:

>>> from pyspark.sql.types import *
>>> schema = StructType([StructField('num', StringType()), StructField('Projections', ArrayType(StructType([StructField('Transactions', ArrayType(MapType(StringType(), IntegerType())))])))])
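As a rough illustration of why the keys disappear without an explicit schema, here is a plain-Python sketch. It does not use Spark, and the variable names are made up for this example. With an inferred struct, the keys end up as field names in the schema and each row carries only the values, which is consistent with the [0,0] in the question's output. With a map, the keys are part of the data and can be turned into rows.

```python
import json

# The innermost object from the question's input.
transactions = json.loads('{"14:45": 0, "15:00": 0}')

# Struct-like view: the keys become field names (schema),
# and the row itself holds only the values.
struct_field_names = list(transactions.keys())
struct_row = list(transactions.values())

# Map-like view: every key/value pair is data.
map_entries = list(transactions.items())

print(struct_field_names)  # ['14:45', '15:00']
print(struct_row)          # [0, 0]
print(map_entries)         # [('14:45', 0), ('15:00', 0)]
```

This is only an analogy for how the two Spark types differ, not a description of Spark's internals.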

Then you can register the DataFrame as a temporary table and query it, using one explode per nesting level (two arrays, then the map):

>>> sqlContext.read.json('sample.json', schema=schema).registerTempTable('df')
>>> sqlContext.sql("select num, explode(col) from (select explode(col.Transactions), num from (select explode(Projections), num from df))").show()
+----+-----+-----+
| num|  key|value|
+----+-----+-----+
|1234|14:45|    0|
|1234|15:00|    0|
+----+-----+-----+
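To make the nested query easier to follow, here is a minimal plain-Python sketch of the same flattening, with one loop per explode. It does not use Spark. The function name flatten and the inlined sample string are assumptions made for this illustration; the string is the input from the question.

```python
import json

# Input from the question, inlined so the sketch is self-contained.
raw = '[{"num":"1234","Projections":[{"Transactions":[{"14:45":0,"15:00":0}]}]}]'


def flatten(records):
    """Yield (num, key, value) rows, one loop per explode in the SQL query."""
    for record in records:
        # explode(Projections): one element per projection
        for projection in record["Projections"]:
            # explode(col.Transactions): one element per transaction map
            for transaction in projection["Transactions"]:
                # explode(col) on the map: one row per key/value pair
                for key, value in transaction.items():
                    yield (record["num"], key, value)


rows = list(flatten(json.loads(raw)))
for row in rows:
    print(*row)
# 1234 14:45 0
# 1234 15:00 0
```

Each level of nesting in the JSON needs its own explode, which is why the SQL query has three nested selects.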