0

I have a file which looks exactly as below.

{"eventid" : "12345" ,"name":"test1","age":"18"}
{"eventid" : "12346" ,"age":"65"}
{"eventid" : "12336" ,"name":"test3","age":"22","gender":"Male"}

Think of the above file as event.json

The number of data objects may vary per line. I would like the following csv output. and it would be output.csv

eventid,name,age,gender
12345,test1,18
12346,,65
12336,test3,22,Male

Could someone kindly help me? I could accept the answer from an any scripting language (Javascript, Python and etc.).

5 Answers 5

2

This code will collect all the headers dynamically and write the file to CSV.

Read comments in code for details:

import json

# Load data from file
data = '''{"eventid" : "12345" ,"name":"test1","age":"18"}
{"eventid" : "12346" ,"age":"65"}
{"eventid" : "12336" ,"name":"test3","age":"22","gender":"Male"}'''

# Store records for later use
records = [];

# Keep track of headers in a set
headers = set([]);

for line in data.split("\n"):
    line = line.strip();

    # Parse each line as JSON
    parsedJson = json.loads(line)

    records.append(parsedJson)

    # Make sure all found headers are kept in the headers set
    for header in parsedJson.keys():
        headers.add(header)

# You only know what headers were there once you have read all the JSON once.

#Now we have all the information we need, like what all possible headers are.

outfile = open('output_json_to_csv.csv','w')

# write headers to the file in order
outfile.write(",".join(sorted(headers)) + '\n')

for record in records:
    # write each record based on available fields
    curLine = []
    # For each header in alphabetical order
    for header in sorted(headers):
        # If that record has the field
        if record.has_key(header):
            # Then write that value to the line
            curLine.append(record[header])
        else:
            # Otherwise put an empty value as a placeholder
            curLine.append('')
    # Write the line to file
    outfile.write(",".join(curLine) + '\n')

outfile.close()
Sign up to request clarification or add additional context in comments.

3 Comments

I'd probably go with this solution, now that you've added the handling of unknown headers to it. It would have been nice to know of that detail in advance.
{"eventid" : "12345" ,"name":"test1","age":18} {"eventid" : "12346" ,"age":65} {"eventid" : "12336" ,"name":"test3","age":22,"gender":"Male"} - If I pass age has integers it doesn't work and throws an error.
@mmenschig Error says outfile.write(",".join(curLine) + '\n') TypeError: sequence item 0: expected string, int found. You should be able to determine why its failing from this message. Please try, what do you think this error means?
2

Here is a solution using jq.

If filter.jq contains the following filter

  (reduce (.[]|keys_unsorted[]) as $k ({};.[$k]="")) as $o   # object with all keys
| ($o  | keys_unsorted), (.[] | $o * . | [.[]])              # generate header and data
| join(",")                                                  # convert to csv

and data.json contains the sample data then

$ jq -Mrs -f filter.jq data.json

produces

eventid,name,age,gender
12345,test1,18,
12346,,65,
12336,test3,22,Male

Comments

0

Here's a Python solution (should work in both Python 2 & 3). I'm not proud of the code, as there's probably a better way to do this (using the csv module) but this gives you the desired output.

I've taken the liberty of naming your JSON data data.json and I'm naming the output csv file output.csv.

import json

header = ['eventid', 'name', 'age', 'gender']

with open('data.json', 'r') as infile, \
     open('outfile.csv', 'w+') as outfile:

    # Writes header row
    outfile.write(','.join(header))
    outfile.write('\n')

    for row in infile:
        line = ['', '', '', ''] # I'm sure there's a better way
        datarow = json.loads(row)

        for key in datarow:
            line[header.index(key)] = datarow[key]

        outfile.write(','.join(line))
        outfile.write('\n')

Hope this helps.

2 Comments

Hi, this is great with one big problem (Which I wasn't probably clear about). How about if I don't know of the possible headers? let's say my next line of the data.json file has "address" now. Is there a way to tackle that ?
Ha, that changes the entire nature of this exercise. So you're saying you don't know what the actual header row is and if it's a new entry to append it to the end?
0

Using Angularjs with ngCsv plugin we can generate csv file from desired json with dynamic headers.

Run in plunkr

// Code goes here

 var myapp = angular.module('myapp', ["ngSanitize", "ngCsv"]);

 myapp.controller('myctrl', function($scope) {
   $scope.filename = "test";
   $scope.getArray = [{
     label: 'Apple',
     value: 2,
     x:1,
   }, {
     label: 'Pear',
     value: 4,
     x:38
   }, {
     label: 'Watermelon',
     value: 4,
     x:38
   }];


   $scope.getHeader = function() {
    var vals = [];
    for( var key in $scope.getArray ) {
    for(var k in $scope.getArray[key]){
      vals.push(k);
     }
     break;
    }
    return vals;
    
   };

 });
<!DOCTYPE html>
<html>
  <head>
    <link href="https://netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap.min.css" rel="stylesheet">

    <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.4.7/angular.min.js"></script>

   <script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.4.7/angular-sanitize.min.js"></script>
   
	<script src="https://cdnjs.cloudflare.com/ajax/libs/ng-csv/0.3.6/ng-csv.min.js"></script>
   

  </head>


  <body>

    <div ng-app="myapp">

      <div class="container" ng-controller="myctrl">

        <div class="page-header">

          <h1>ngCsv <small>example</small></h1>

        </div>
       
        

        <button class="btn btn-default" ng-csv="getArray" csv-header="getHeader()" filename="{{ filename }}.csv" field-separator="," decimal-separator=".">Export to CSV with header</button>

       
      </div>
    </div>
  </body>
</html>

Comments

-1
var arr = $.map(obj, function(el) { return el });
var content = "";
for(var element in arr){
    content += element + ",";
}

var filePath = "someFile.csv";
var fso = new ActiveXObject("Scripting.FileSystemObject");
var fh = fso.OpenTextFile(filePath, 8, false, 0);
fh.WriteLine(content);
fh.Close();

2 Comments

Sorry, I am not sure what you meant about. I want to dynamically capture the column headers.
initial file would be something like events.json and the output will be in output.csv. Do you follow ? Where would I pass the location of the event.json file on this script ?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.