
I'm looking to convert JSON containing an array into CSV format. The number of elements inside the array is dynamic for each row. I tried this flow (the flow file XML is attached to the post):

GetFile --> ConvertRecord --> UpdateAttribute --> PutFile

Are there any other alternatives?

JSON format:

{
  "LogData": {
    "Location": "APAC",
    "product": "w1"
  },
  "Outcome": [
    {
      "limit": "0",
      "pri": "3",
      "result": "pass"
    },
    {
      "limit": "1",
      "pri": "2",
      "result": "pass"
    },
    {
      "limit": "5",
      "priority": "1",
      "result": "fail"
    }
  ],
  "attr": {
    "vers": "1",
    "datetime": "2018-01-10 00:36:00"
  }
}

Expected output in csv:

location,product,limit,pri,result,vers,datetime
APAC,w1,0,3,pass,1,2018-01-10 00:36:00
APAC,w1,1,2,pass,1,2018-01-10 00:36:00
APAC,w1,5,1,fail,1,2018-01-10 00:36:00

Output from the attached flow:

LogData,Outcome,attr
"MapRecord[{product=w1, Location=APAC}]","[MapRecord[{limit=0, result=pass, pri=3}], MapRecord[{limit=1, result=pass, pri=2}], MapRecord[{limit=5, result=fail}]]","MapRecord[{datetime=2018-01-10 00:36:00, vers=1}]"


ConvertRecord is configured with a JsonTreeReader and a CSVRecordSetWriter as below (screenshots attached to the post).

The JsonTreeReader, CSVRecordSetWriter, and AvroSchemaRegistry controller service configurations are also attached as screenshots.

Avro schema:

{
  "name": "myschema",
  "type": "record",
  "namespace": "myschema",
  "fields": [
    {
      "name": "LogData",
      "type": {
        "name": "LogData",
        "type": "record",
        "fields": [
          { "name": "Location", "type": "string" },
          { "name": "product", "type": "string" }
        ]
      }
    },
    {
      "name": "Outcome",
      "type": {
        "type": "array",
        "items": {
          "name": "Outcome_record",
          "type": "record",
          "fields": [
            { "name": "limit", "type": "string" },
            { "name": "pri", "type": ["string", "null"] },
            { "name": "result", "type": "string" }
          ]
        }
      }
    },
    {
      "name": "attr",
      "type": {
        "name": "attr",
        "type": "record",
        "fields": [
          { "name": "vers", "type": "string" },
          { "name": "datetime", "type": "string" }
        ]
      }
    }
  ]
}
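As an alternative to a pure record-based flow, the flattening boils down to copying the shared LogData/attr fields onto each Outcome element before writing CSV, which could be done in a scripting processor such as ExecuteScript. A minimal Python sketch of that logic (the flatten_log/to_csv names are illustrative, not NiFi APIs):

```python
import csv
import io

def flatten_log(record):
    """Copy the shared LogData/attr fields onto every Outcome element."""
    common = {
        "location": record["LogData"]["Location"],
        "product": record["LogData"]["product"],
    }
    trailer = {
        "vers": record["attr"]["vers"],
        "datetime": record["attr"]["datetime"],
    }
    # One output row per array element, however many there are
    return [{**common, **outcome, **trailer} for outcome in record["Outcome"]]

def to_csv(rows):
    fields = ["location", "product", "limit", "pri", "result", "vers", "datetime"]
    buf = io.StringIO()
    # extrasaction="ignore" tolerates stray keys such as the sample's "priority"
    writer = csv.DictWriter(buf, fieldnames=fields, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In a flow you would parse each FlowFile's content with json.loads and emit to_csv(flatten_log(...)) as the new content.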

2 Answers


Try this spec in JoltTransformJSON before ConvertRecord:

[
  {
    "operation": "shift",
    "spec": {
      "Outcome": {
        "*": {
          "@(3,LogData.Location)": "[#2].location",
          "@(3,LogData.product)": "[#2].product",
          "@(3,attr.vers)": "[#2].vers",
          "@(3,attr.datetime)": "[#2].datetime",
          "*": "[#2].&"
        }
      }
    }
  }
]
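For reference, applied to the sample JSON in the question, this shift should produce roughly the following array, one element per Outcome entry (key order may vary; note the third element keeps the sample's "priority" key as-is, because the "*" match copies field names unchanged):

[
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "0",
    "pri": "3",
    "result": "pass"
  },
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "1",
    "pri": "2",
    "result": "pass"
  },
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "5",
    "priority": "1",
    "result": "fail"
  }
]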

2 Comments

Thanks a ton @mattyb for the help. This works like a charm. One question: if I have 1300+ array elements inside Outcome:[], will the Jolt transform work well, or will there be overhead when converting 1 incoming JSON record into 1300+ JSON records?
Not sure what you mean by overhead, but I can tell you that all 1300+ flow files will be transferred at the same time, once they are all done processing and the session is committed.

It seems you need to perform a JoltTransform before converting to CSV; otherwise it is not going to work.

