
I'm looking to convert JSON containing an array into CSV format. The number of elements inside the array is dynamic for each row. I tried this flow (the flow file XML is attached to the post):

GetFile --> ConvertRecord --> UpdateAttribute --> PutFile

Are there any other alternatives?

JSON format:

{
  "LogData": {
    "Location": "APAC",
    "product": "w1"
  },
  "Outcome": [
    {
      "limit": "0",
      "pri": "3",
      "result": "pass"
    },
    {
      "limit": "1",
      "pri": "2",
      "result": "pass"
    },
    {
      "limit": "5",
      "priority": "1",
      "result": "fail"
    }
  ],
  "attr": {
    "vers": "1",
    "datetime": "2018-01-10 00:36:00"
  }
}

Expected output in csv:

location,product,limit,pri,result,vers,datetime
APAC,w1,0,3,pass,1,2018-01-10 00:36:00
APAC,w1,1,2,pass,1,2018-01-10 00:36:00
APAC,w1,5,1,fail,1,2018-01-10 00:36:00

Output from the attached flow:

LogData,Outcome,attr
"MapRecord[{product=w1, Location=APAC}]","[MapRecord[{limit=0, result=pass, pri=3}], MapRecord[{limit=1, result=pass, pri=2}], MapRecord[{limit=5, result=fail}]]","MapRecord[{datetime=2018-01-10 00:36:00, vers=1}]"


ConvertRecord is configured with a JsonTreeReader and a CSVRecordSetWriter as below (screenshots attached to the post).

The JsonTreeReader, CSVRecordSetWriter, and AvroSchemaRegistry controller service configurations are also attached as screenshots.

Avro schema:

{
  "name": "myschema",
  "type": "record",
  "namespace": "myschema",
  "fields": [
    {
      "name": "LogData",
      "type": {
        "name": "LogData",
        "type": "record",
        "fields": [
          { "name": "Location", "type": "string" },
          { "name": "product", "type": "string" }
        ]
      }
    },
    {
      "name": "Outcome",
      "type": {
        "type": "array",
        "items": {
          "name": "Outcome_record",
          "type": "record",
          "fields": [
            { "name": "limit", "type": "string" },
            { "name": "pri", "type": ["string", "null"] },
            { "name": "result", "type": "string" }
          ]
        }
      }
    },
    {
      "name": "attr",
      "type": {
        "name": "attr",
        "type": "record",
        "fields": [
          { "name": "vers", "type": "string" },
          { "name": "datetime", "type": "string" }
        ]
      }
    }
  ]
}
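As an alternative to a pure record-based flow, the flattening boils down to copying the shared LogData/attr fields onto each Outcome element before writing CSV, which could be done in a scripting processor such as ExecuteScript. A minimal Python sketch of that logic (the flatten_log/to_csv names are illustrative, not NiFi APIs):

```python
import csv
import io

def flatten_log(record):
    """Copy the shared LogData/attr fields onto every Outcome element."""
    common = {
        "location": record["LogData"]["Location"],
        "product": record["LogData"]["product"],
    }
    trailer = {
        "vers": record["attr"]["vers"],
        "datetime": record["attr"]["datetime"],
    }
    # One output row per array element, however many there are
    return [{**common, **outcome, **trailer} for outcome in record["Outcome"]]

def to_csv(rows):
    fields = ["location", "product", "limit", "pri", "result", "vers", "datetime"]
    buf = io.StringIO()
    # extrasaction="ignore" tolerates stray keys such as the sample's "priority"
    writer = csv.DictWriter(buf, fieldnames=fields, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In a flow you would parse each FlowFile's content with json.loads and emit to_csv(flatten_log(...)) as the new content.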

2 Answers


Try this spec in JoltTransformJSON before ConvertRecord:

[
  {
    "operation": "shift",
    "spec": {
      "Outcome": {
        "*": {
          "@(3,LogData.Location)": "[#2].location",
          "@(3,LogData.product)": "[#2].product",
          "@(3,attr.vers)": "[#2].vers",
          "@(3,attr.datetime)": "[#2].datetime",
          "*": "[#2].&"
        }
      }
    }
  }
]
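For reference, applied to the sample JSON in the question, this shift should produce roughly the following array, one element per Outcome entry (key order may vary; note the third element keeps the sample's "priority" key as-is, because the "*" match copies field names unchanged):

[
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "0",
    "pri": "3",
    "result": "pass"
  },
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "1",
    "pri": "2",
    "result": "pass"
  },
  {
    "location": "APAC",
    "product": "w1",
    "vers": "1",
    "datetime": "2018-01-10 00:36:00",
    "limit": "5",
    "priority": "1",
    "result": "fail"
  }
]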

2 Comments

Thanks a ton @mattyb for the help. This works like a charm. One question: if I have 1300+ array elements inside Outcome:[], will the Jolt transform work well, or will there be overhead when converting 1 incoming JSON record into 1300+ JSON records?
Not sure what you mean by overhead, but I can tell you that all 1300+ flow files will be transferred at the same time, once they are all done processing and the session is committed.

It seems you need to perform a JoltTransform before converting to CSV; otherwise it is not going to work.

