I have a fairly large array of hashes (stored in @hash["response"]["results"]) returned by my program in JSON format.

I have seen several examples on Stack Overflow of how to convert a simple hash to CSV format, but I haven't been able to find any more complex examples that do it with a larger, nested dataset.

I would like to use the hash keys ("pluginID", "ip", "pluginName", etc.) as the CSV headers and the hash values ("11112", "100.100.100.100", "Name for plugin here", etc.) for the CSV row content.

Note that the "repository" key is a hash itself and for that I'd like to just use the name, as opposed to the ID or description.

Any help is greatly appreciated. I have played with some code samples following the Ruby CSV standard library instructions but I am not even getting close.

@hash = '{
  "type": "regular",
  "response": {
    "Records": "137",
    "rRecords": 137,
    "startOffset": "0",
    "endOffset": "500",
    "matchingDataElementCount": "-1",
    "results": [
      { "pluginID": "11112",
        "ip": "100.100.100.100",
        "pluginName": "Name for plugin here",
        "firstSeen": "1444208776",
        "lastSeen": "1451974232",
        "synopsis": "synopsis contents",
        "description": "Full description would go here... Full description would go here... Full description would go here... Full description would go here... Full description would go here...",
        "solution": "",
        "version": "Revision: 1.51",
        "pluginText": "output text here",
        "dnsName": "name",
        "repository": {
          "id": "1",
          "name": "Name Here As Well",
          "description": "Description here also"
        },
        "pluginInfo": "11112 (0/6) Name for plugin here"
      },
      { "pluginID": "11113",
        "ip": "100.100.100.100",
        "pluginName": "Name for plugin here",
        "firstSeen": "1444455329",
        "lastSeen": "1451974232",
        "synopsis": "Tsynopsis contents",
        "description": "Full description would go here... Full description would go here... Full description would go here... Full description would go here... Full description would go here...",
        "solution": "",
        "version": "Revision: 1.51",
        "pluginText": "output text here",
        "dnsName": "name here",
        "repository": {
          "id": "1",
          "name": "Name Here As Well",
          "description": "Description here also"
        },
        "pluginInfo": "11112 (0/6) Name for plugin here"
      },
      { "pluginID": "11113",
        "ip": "100.100.100.100",
        "pluginName": "Name for plugin here : Passed",
        "firstSeen": "1444455329",
        "lastSeen": "1444455329",
        "synopsis": "nope, more synopsis data here",
        "description": "Uanother different description",
        "solution": "",
        "version": "Revision: 1.14",
        "pluginText": "",
        "dnsName": "name here",
        "repository": {
          "id": "1",
          "name": "Name Here As Well",
          "description": "Description here also"
        },
        "pluginInfo": "11114 (0/6) Name for plugin here : Passed"
      },
      { "pluginID": "11115",
        "ip": "100.100.100.100",
        "pluginName": "Name for plugin here",
        "firstSeen": "1444455329",
        "lastSeen": "1444455329",
        "synopsis": "Tsynopsis contents",
        "description": "Full description would go here... Full description would go here... Full description would go here... Full description would go here... Full description would go here...",
        "solution": "",
        "version": "Revision: 1.51",
        "pluginText": "output text here",
        "dnsName": "",
        "repository": {
          "id": "1",
          "name": "Name Here As Well",
          "description": "Description here also"
        },
        "pluginInfo": "11116 (0/6) Name for plugin here"
      }
    ]
  },
  "code": 0,
  "msg": "",
  "msg_det": [],
  "time": 1454733549
}'

2 Answers

This is pretty easy. There are essentially five steps:

  1. Parse the JSON into a Ruby Hash.
  2. Get the key names from the first hash in the "results" array and write them to the CSV file as headers.
  3. Iterate over the "results" array and for each hash:

    1. Replace the "repository" hash with its "name" value.
    2. Extract the values in the same order as the headers and write them to the CSV file.

The code looks something like this:

require 'json'
require 'csv'

json = '{
  "type": "regular",
  "response": {
    ...
  },
  ...
}'

# Parse the JSON
hash = JSON.parse(json)

# Get the Hash we're interested in
results = hash['response']['results']

# Get the key names to use as headers
headers = results[0].keys

filename = "/path/to/output.csv"

CSV.open(filename, 'w', headers: :first_row) do |csv|
  # Write the headers to the CSV
  csv << headers

  # Iterate over the "results" hashes
  results.each do |result|
    # Replace the "repository" hash with its "name" value
    result['repository'] = result['repository']['name']

    # Get the values in the same order as the headers and write them to the CSV
    csv << result.values_at(*headers)
  end
end

This code (headers = results[0].keys) assumes that the first "results" hash will have all of the keys you want in the CSV. If that's not the case you need to either:

  1. Specify the headers explicitly, e.g.:

    headers = %w[ pluginID ip pluginName ... ]
    
  2. Loop over all of the hashes and build a list of all of their keys:

    headers = results.reduce([]) {|all_keys, result| all_keys | result.keys }
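
As a rough, self-contained sketch of both options combined: the sample data below is made up for illustration (the second record deliberately lacks "repository" and has an extra key), and flatten_repository and preferred are hypothetical names, not part of the question's code. It puts an explicit column order first, appends any other keys found in the data, and builds a copy of each record instead of modifying it in place.

```ruby
require 'csv'
require 'json'

# Hypothetical sample data: the second record has an extra key and no "repository".
results = JSON.parse(<<~JSON)
  [
    { "pluginID": "11112", "ip": "100.100.100.100",
      "repository": { "id": "1", "name": "Name Here As Well" } },
    { "pluginID": "11113", "ip": "100.100.100.100", "dnsName": "name here" }
  ]
JSON

# Return a copy of a record with "repository" replaced by its "name".
# Records without a "repository" hash are returned unchanged.
# (flatten_repository is a made-up helper name.)
def flatten_repository(result)
  repo = result['repository']
  repo.is_a?(Hash) ? result.merge('repository' => repo['name']) : result
end

# Explicit column order first, then any remaining keys found in the data.
# Array#| is a set union that keeps first-seen order.
preferred = %w[ip pluginID]
headers = preferred | results.flat_map(&:keys)

# CSV.generate builds the CSV in memory and returns it as a String.
csv_text = CSV.generate do |csv|
  csv << headers
  results.each do |result|
    # Missing keys come back as nil, which CSV writes as an empty field
    csv << flatten_repository(result).values_at(*headers)
  end
end

puts csv_text
```

Swapping CSV.generate for CSV.open(filename, 'w') should write the same rows to a file instead of a string.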
    

8 Comments

thank you for the detailed response with explanation. Your code works as-is if the json variable is a string of JSON, however I used that as an example. I am trying to actually parse @hash["response"]["results"] which is an array of hashes, therefore I am getting "undefined method `keys'" because the keys method doesn't work on an Array. Would you be so kind to tune your answer to my scenario? I've tried for about 30 minutes but haven't been able to figure out how to make this work. To be clear, @hash is a Hash, but @hash["response"]["results"] is actually an array. Thanks much!
You want @hash["response"]["results"][0].keys, then.
If @hash is the already-parsed Ruby hash then the above code will work as-is if you skip everything before results = hash['response']['results'] and replace that line with results = @hash['response']['results'].
You are an absolute genius! Thank you so much. I thought I tried that exact thing but I guess I had some small variance of difference. Just curious, is there a way to specify an explicit order for the columns? I'd love to list them in a specific order that is different than how they are stored in the hash!! THANKS AGAIN! I will accept your answer as it is the most clear and explained (but thank you ProgNoob for your suggestion). Just not sure if I accept the answer if you won't see future comments, so I'll wait until I hear back.
I mentioned how to specify the headers (and their order) under “1.” at the end of my answer. And yes I'll still get notified about comments after an answer has been accepted.

I used a solution like this:

require 'csv'

# Pick the values you want from each result, in column order
stats_rows = @hash["response"]["results"].map do |e|
  [e["pluginID"], e["ip"], e["pluginName"]]
end

# CSV.generate returns the CSV as a String, so keep the result
csv_string = CSV.generate do |csv|
  csv << ["pluginID", "ip", "pluginName"] # put your hash keys into the CSV as headers
  stats_rows.each do |row| # values
    csv << row
  end
end
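
If you also want the repository name from the question, one possible variation is sketched below. The parsed hash is a hypothetical stand-in for the already-parsed @hash, and columns is just an illustrative name. Hash#dig is used so that a record without a "repository" key yields an empty field rather than raising an error.

```ruby
require 'csv'

# Hypothetical stand-in for the already-parsed @hash from the question.
parsed = {
  'response' => {
    'results' => [
      { 'pluginID' => '11112', 'ip' => '100.100.100.100',
        'pluginName' => 'Name for plugin here',
        'repository' => { 'id' => '1', 'name' => 'Name Here As Well' } }
    ]
  }
}

# Top-level keys to export, in column order
columns = %w[pluginID ip pluginName]

csv_text = CSV.generate do |csv|
  # Header row: the chosen keys plus one column for the repository name
  csv << columns + ['repository']
  parsed['response']['results'].each do |e|
    # Hash#dig returns nil (an empty CSV field) if "repository" is absent
    csv << e.values_at(*columns) + [e.dig('repository', 'name')]
  end
end

puts csv_text
```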

2 Comments

Thanks so much for the quick response! I will try this out tomorrow and let you know if it does the trick (and mark the answer accepted if so). Thank you!
Thank you @ProgNoob for this answer. I'm sure it will help out others but due to the detailed response of the answer provided by Jordan, I decided to mark that as the accepted answer instead. Appreciate your help though very much!
