
I have a JSON file that I'd like to convert to JSON Lines in Spark Scala. I was able to do it in Python just by using Pandas' read_json method and then writing the result back out with the lines parameter set.

Say the original format is:

{
    "A": "400",
    "B": "100",
    "C": "DEM",
    "D": "USD",
    "E": "80029898",
    "F": "1.64110-",
    "G": "0 "
},
{
    "A": "400",
    "B": "100",
    "C": "USD",
    "D": "DEM",
    "E": "80029898",
    "F": "1.64110 ",
    "G": "0 "
},

I'd like to write it as:

{"A":"400","B":"100","C":"DEM","D":"USD","E":"80029898","F":"1.64110-","G":"0"}
{"A":"400","B":"100","C":"USD","D":"DEM","E":"80029898","F":"1.64110 ","G":"0"}

Thanks so much and have a great day!

  • Is the input format a comma-separated list of JSON objects, or should there be brackets [...] ? Commented Jan 26, 2021 at 16:24
  • A comma-separated list of JSON objects. Commented Jan 26, 2021 at 16:40
  • JSON can be read using spark.read.json() and written using df.write.json(). Refer to this link for more info and to build on your case. Commented Jan 26, 2021 at 17:04

1 Answer


If you are using Spark > 2.2, you can use:

val df = spark.read
  .option("multiLine", true)
  .option("mode", "PERMISSIVE")
  .json("/path/to/user.json")

and then write it out in the desired format; df.write.json emits one JSON object per line:

df.write.json("/path/to/output")
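Putting the two steps together, here is a minimal end-to-end sketch. The SparkSession setup, the input/output paths, and the optional coalesce(1) are illustrative assumptions, not part of the original answer; it also assumes the input has been wrapped in brackets [...] so it parses as a valid JSON array:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("json-to-jsonlines")
  .getOrCreate()

// multiLine lets Spark parse a single JSON value that spans several lines;
// PERMISSIVE mode keeps malformed records (in a corrupt-record column)
// instead of failing the whole job
val df = spark.read
  .option("multiLine", true)
  .option("mode", "PERMISSIVE")
  .json("/path/to/user.json")

// write.json produces JSON Lines output: one object per line.
// coalesce(1) yields a single part file -- optional, and only sensible
// for small datasets, since it funnels all data through one task
df.coalesce(1).write.json("/path/to/output")
```

Note that the output will be a directory of part files rather than a single .json file; that is Spark's standard write behavior.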