2

I am trying to read a JSON file in order to calculate some metrics in Scala. I managed to read the file and apply some top-level filters, but I have trouble understanding how to filter nested lists and maps.

Here is some example code (the real JSON is longer):

  val rawData = """[
  {
    "technology": "C",
    "users": [
    {
      "rating": 5,
      "completed": false,
      "user": {
        "id": 11111,
        "paid": true
      }
    },
    {
      "rating": 4,
      "completed": false,
      "user": {
        "id": 22222,
        "paid": false
      }
    }
    ],
    "title": "CS50"
  },
  {
    "technology": "C++",
    "users": [
    {
      "rating": 3,
      "completed": true,
      "user": {
        "id": 33333,
        "paid": false
      }
    },
    {
      "rating": 5,
      "completed": true,
      "user": {
        "id": 44444,
        "paid": false
      }
    }
    ],
    "title": "Introduction to C++"
  },
  {
    "technology": "Haskell",
    "users": [
    {
      "rating": 5,
      "completed": false,
      "user": {
        "id": 55555,
        "paid": false
      }
    },
    {
      "rating": null,
      "completed": true,
      "user": {
        "id": 66666,
        "paid": false
      }
    }
    ],
    "title": "Course on Haskell"
  }
  ]"""

  val data = rawData.toString.split("\n").toSeq.map(_.trim).filter(_ != "").mkString("")

I managed to get a list containing the 3 titles:

import scala.util.parsing.json._
val parsedData = JSON.parseFull(data)
val listTitles = parsedData.get.asInstanceOf[List[Map[String, Any]]].map( { case e: Map[String, Any] => e("title").toString }  )

Here are my 3 questions:

  1. Is that a good approach to obtain the list of the 3 titles?
  2. How to obtain a List containing the number of paid users for each of the latter 3 titles?
  3. How to obtain a List containing the number of users who have completed the course for each of the latter 3 titles?

Thanks in advance for the help


3 Answers

1

As the other answer suggested, you should use the play-json library. It covers parsing, mapping JSON to case classes, and validation with error reporting.

  import play.api.libs.json._

  // ids are JSON numbers in the sample data, so they are read as Int
  case class User(id: Int, paid: Boolean)
  object User {
    implicit val format: OFormat[User] = Json.format[User]
  }

  // rating is null for one user in the sample data, hence Option[Int]
  case class UserCourseStat(rating: Option[Int], completed: Boolean, user: User)
  object UserCourseStat {
    implicit val format: OFormat[UserCourseStat] = Json.format[UserCourseStat]
  }

  case class Data(technology: String, title: String, users: List[UserCourseStat])
  object Data {
    implicit val format: OFormat[Data] = Json.format[Data]
  }

  val jsString = """[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]"""

  val rowData: JsValue = Json.parse(jsString)

  rowData.validate[List[Data]] match {
    case JsSuccess(dataList, _) =>
      val chosenTitles = List("Course on Haskell", "Introduction to C++", "CS50")

      // map of each chosen title to the set of its users
      val chosenTitleToUsersMap = chosenTitles.map { title =>
        title -> dataList.filter(_.title == title)
          .flatMap(_.users.map(_.user))
          .toSet
      }.toMap
      // map of each chosen title to the set of its paid users
      val chosenTitleToPaidUsersMap = chosenTitleToUsersMap.map { case (title, users) =>
        title -> users.filter(_.paid)
      }

      // map of each chosen title to the set of users who completed it
      val chosenTitleToCompletedUsersMap = chosenTitles.map { title =>
        title -> dataList.filter(_.title == title)
          .flatMap(_.users.filter(_.completed).map(_.user))
          .toSet
      }.toMap

      // users who have completed every one of the chosen titles
      val allUsers = dataList.flatMap(_.users.map(_.user)).toSet

      val usersWhoCompletedAllChosenTitles = allUsers.filter { user =>
        chosenTitles.forall { title =>
          chosenTitleToCompletedUsersMap.get(title).exists(_.contains(user))
        }
      }

    case JsError(errors) =>
      // errors lists each failing JSON path with its validation errors;
      // handle the error case
      ???
  }

Regarding the 3 questions you had:

  1. Is that a good approach to obtain the list of the 3 titles?

It works, but I can see 2 unsafe operations there: asInstanceOf and e("title"). The cast can fail at runtime if the parsed value does not have the expected shape, and e("title") throws a NoSuchElementException if the key is not found. Use the .get(key) method of the Map instead, which returns an Option.

  2. How to obtain a List containing the number of paid users for each of the latter 3 titles?

See the val named chosenTitleToPaidUsersMap above. It maps each title to the set of its paid users, so the number of paid users for a title is the size of that set.

  3. How to obtain a List containing the number of users who have completed the course for each of the latter 3 titles?

See the val named usersWhoCompletedAllChosenTitles above.
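
If you want to stay with the untyped structure for now, the first point can be sketched as follows. This is only an illustration under assumptions: the value parsed is built by hand as a stand-in for the result of JSON.parseFull (objects as Map[String, Any], arrays as List[Any], numbers as Double), and asObject and asArray are hypothetical helper names, not part of any library.

```scala
// Hand-built stand-in for a parseFull result covering two of the courses
// (assumption: objects are Map[String, Any], arrays List[Any], numbers Double).
val parsed: Option[Any] = Some(List(
  Map(
    "title" -> "CS50",
    "users" -> List(
      Map("rating" -> 5.0, "completed" -> false, "user" -> Map("id" -> 11111.0, "paid" -> true)),
      Map("rating" -> 4.0, "completed" -> false, "user" -> Map("id" -> 22222.0, "paid" -> false))
    )
  ),
  Map(
    "title" -> "Introduction to C++",
    "users" -> List(
      Map("rating" -> 3.0, "completed" -> true, "user" -> Map("id" -> 33333.0, "paid" -> false))
    )
  )
))

// Hypothetical helper: narrow an Any to a string-keyed map, or None.
def asObject(value: Any): Option[Map[String, Any]] = value match {
  case m: Map[_, _] => Some(m.iterator.collect { case (k: String, v) => (k, v: Any) }.toMap)
  case _            => None
}

// Hypothetical helper: narrow an Any to a list, or Nil if it is something else.
def asArray(value: Any): List[Any] = value match {
  case xs: List[_] => xs
  case _           => Nil
}

val courses: List[Map[String, Any]] =
  parsed.toList.flatMap(asArray).flatMap(c => asObject(c))

// Map#get returns an Option, so a missing or non-string title is skipped.
val titles: List[String] =
  courses.flatMap(_.get("title")).collect { case t: String => t }

// Paid users per title: descend into the nested "users" list and "user" map.
val paidCounts: List[(String, Int)] = for {
  course <- courses
  title  <- course.get("title").collect { case t: String => t }
} yield {
  val users = course.get("users").toList.flatMap(asArray).flatMap(u => asObject(u))
  val paid = users.count { u =>
    u.get("user").flatMap(asObject).flatMap(_.get("paid")).contains(true)
  }
  title -> paid
}
```

Nothing here throws on unexpected input: entries with the wrong shape are skipped. Whether skipping silently is what you want depends on your data; the validate approach above reports such problems instead.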


0

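You can use the play-json library to parse the JSON and retrieve the fields you want. The \\ operator searches the whole document recursively, so it returns every field named "title" at any depth. For example: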

import play.api.libs.json.Json

val rawData1 = Json.parse("""[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]""")

val resultedList = (rawData1 \\ "title").toList.map(_.as[String])


0

I suggest you use the json4s library. It allows you to extract your data into case classes:

import org.json4s.jackson.JsonMethods.parseOpt
import org.json4s.DefaultFormats
implicit val formats = DefaultFormats

case class Tech(technology: String, users: Seq[TechUser], title: String)
case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
case class UserInfo(id: Int, paid: Boolean)

val rawData = """..."""
val Some(json) = parseOpt(rawData)
val Some(data) = json.extractOpt[List[Tech]]

After doing this, data is a regular Scala data structure that you can manipulate as you wish. For example, to find the first title that has a user whose id is divisible by 5:

data.find(_.users.exists(_.user.id % 5 == 0)).map(_.title)
// Result: Some("Course on Haskell")

The answers to your three questions are one-liners just like this one, but I leave them to you as an exercise.
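
For readers who want to check their attempt, here is one possible sketch. It reuses the case classes above and builds data by hand, on the assumption that this is what extractOpt[List[Tech]] yields for the sample JSON; the names titles, paidPerTitle and completedPerTitle are only illustrative.

```scala
case class UserInfo(id: Int, paid: Boolean)
case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
case class Tech(technology: String, users: Seq[TechUser], title: String)

// Hand-built stand-in for the extracted value (assumption: same content as
// the sample JSON, with the null rating mapped to None).
val data: List[Tech] = List(
  Tech("C", Seq(
    TechUser(Some(5), false, UserInfo(11111, true)),
    TechUser(Some(4), false, UserInfo(22222, false))), "CS50"),
  Tech("C++", Seq(
    TechUser(Some(3), true, UserInfo(33333, false)),
    TechUser(Some(5), true, UserInfo(44444, false))), "Introduction to C++"),
  Tech("Haskell", Seq(
    TechUser(Some(5), false, UserInfo(55555, false)),
    TechUser(None, true, UserInfo(66666, false))), "Course on Haskell")
)

// 1. The list of titles.
val titles = data.map(_.title)

// 2. Number of paid users for each title.
val paidPerTitle = data.map(t => t.title -> t.users.count(_.user.paid))

// 3. Number of users who completed the course, for each title.
val completedPerTitle = data.map(t => t.title -> t.users.count(_.completed))
```

Keeping the title next to each count avoids relying on list positions; use .map(_._2) on either result if you only need the bare numbers.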
