
I found several helpful answers, but they were all converting a JSON file to a df. In my case, I have a df with columns that contain JSON strings, like this:

s-timestamp: 2019-10-10

content: {"META":{"testA":"1","TABLENAME":"some_table_name"},"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE",}"

I need to normalize the content column. How can I do that?

  • What do you mean by normalization? Do you need to extract some columns from the JSON column into the initial df? Commented Oct 12, 2019 at 7:41
  • Probably similar to stackoverflow.com/questions/58037893/… Commented Oct 12, 2019 at 16:26

1 Answer


Welcome! There are a few ways of dealing with JSON strings in Spark DataFrame columns. You can use functions like get_json_object to extract specific fields from the JSON, or from_json to parse the column into a StructType with a given schema. Another option is to use spark.read.json to parse the column's contents and create a separate dataframe from them. Have a look at my solution here and let me know if it helps.
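A minimal PySpark sketch of the approaches mentioned above, not the linked solution itself. The column and field names (content, META, PINACOLADA, TABLENAME, sampleID) come from the question; the sample DataFrame, schema, and output column names are illustrative assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json, get_json_object
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.getOrCreate()

    # Toy DataFrame shaped like the question's data (assumed, for illustration).
    df = spark.createDataFrame(
        [("2019-10-10",
          '{"META":{"testA":"1","TABLENAME":"some_table_name"},'
          '"PINACOLADA":{"sampleID":"0","itemInserted":"2019-10-10","sampleType":"BASE"}}')],
        ["s-timestamp", "content"],
    )

    # Option 1: pull out individual fields with get_json_object (JSONPath syntax).
    df_fields = (
        df.withColumn("tablename", get_json_object(col("content"), "$.META.TABLENAME"))
          .withColumn("sampleID", get_json_object(col("content"), "$.PINACOLADA.sampleID"))
    )

    # Option 2: parse the whole column with from_json and an explicit schema,
    # then flatten the nested structs into top-level columns.
    schema = StructType([
        StructField("META", StructType([
            StructField("testA", StringType()),
            StructField("TABLENAME", StringType()),
        ])),
        StructField("PINACOLADA", StructType([
            StructField("sampleID", StringType()),
            StructField("itemInserted", StringType()),
            StructField("sampleType", StringType()),
        ])),
    ])
    df_parsed = (
        df.withColumn("parsed", from_json(col("content"), schema))
          .select("s-timestamp", "parsed.META.*", "parsed.PINACOLADA.*")
    )
    df_parsed.show(truncate=False)

    # Option 3: let Spark infer the schema by reading the column's strings as JSON,
    # producing a separate DataFrame (no link back to s-timestamp).
    df_inferred = spark.read.json(df.select("content").rdd.map(lambda r: r[0]))

Option 2 is the most predictable if the JSON is inconsistent: records that do not match the schema simply yield nulls instead of failing, whereas schema inference in option 3 depends on whatever fields happen to appear in the data.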


1 Comment

Thanks, it looks like I'll have the same problem as the solution post: my data may be inconsistent. I am going to experiment with your approach. Thanks so much.
