
I have a Dataset[(Long, String)] that contains an id and a JSON string. It's built more or less like this:

val ids: Dataset[Long] = ...
val results = ids.mapPartitions { ids =>
  // Create the HTTP client once per partition
  // ...
  // Pair each id with the JSON string fetched for it
  ids.map(id => (id, getJsonById(id)))
}

If I run results.toDF, it creates a DataFrame with the id and a string column holding the JSON, but what I want is a DataFrame with the id and all the columns contained in the JSON.

How can I achieve that?

Edit: I want to load the whole JSON as a DataFrame, not a particular field of it. Something like what spark.read.json(jsonRDD: RDD[String]) would do.

Thanks

  • Something like stackoverflow.com/questions/39238367/…? Commented Mar 30, 2017 at 15:22
  • If I'm not mistaken, that approach creates a new column from a single value inside the JSON, but in my case I want the whole JSON structure in the DataFrame. Commented Mar 30, 2017 at 15:58
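
One possible approach, offered as a minimal sketch rather than a definitive answer: infer the schema from the JSON strings with spark.read.json, then parse the string column with from_json and expand the resulting struct next to the id. The sketch assumes Spark 2.2 or later (where spark.read.json accepts a Dataset[String]; from_json itself needs 2.1 or later). The names JsonColumnsSketch and expandJson, and the sample rows in main, are hypothetical and only there for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}

// Hypothetical helper object; not part of the original question.
object JsonColumnsSketch {

  // Turns (id, jsonString) pairs into a DataFrame with the id plus
  // one column per top-level JSON field.
  def expandJson(spark: SparkSession, results: Dataset[(Long, String)]): DataFrame = {
    import spark.implicits._

    // Cache so the upstream work (e.g. HTTP calls) is not repeated:
    // schema inference below runs its own job over the data.
    val cached = results.cache()

    // Let Spark infer the schema from the JSON strings alone.
    // Assumption: Spark 2.2+, where read.json accepts a Dataset[String].
    val schema = spark.read.json(cached.map(_._2)).schema

    cached.toDF("id", "json")
      // Parse each JSON string into a struct using the inferred schema.
      .select(col("id"), from_json(col("json"), schema).as("parsed"))
      // Expand the struct's fields into top-level columns.
      .select(col("id"), col("parsed.*"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("json-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the real `results` Dataset.
    val results = Seq(
      (1L, """{"name":"a","n":1}"""),
      (2L, """{"name":"b","n":2}""")
    ).toDS()

    expandJson(spark, results).show()
    spark.stop()
  }
}
```

Two caveats with this sketch: if the JSON itself has a top-level field called id, the result will contain two columns with that name, so one of them would need renaming first; and schema inference scans all the JSON strings, which is why the input is cached before the second pass.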
