API JSON-response to CSV

Question

I would like this output in a different format, but I am struggling to work out how to get it.

CODE:

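import http.client
import json
import urllib.parse

import pandas as pd

# NOTE: `headers` (the request headers) is defined elsewhere and not shown here
params = urllib.parse.urlencode({
})

conn = http.client.HTTPSConnection('api-extern.XXX.se')
try:
    conn.request("GET", "/product/v1/product?%s" % params, "{body}", headers)
    response = conn.getresponse()
    data = response.read()
    json_data = json.loads(data)
    df = pd.io.json.json_normalize(json_data)
    df.to_csv(r'C:\Users\aaa.bbb\Documents\Python Scripts\file.csv', index=False, sep=';', encoding='utf-8')
finally:
    # Close the connection even if the request or the parsing fails
    conn.close()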

OUTPUT
All products for a site are merged into one cell.

WANTED OUTPUT
Either:

one row per product, with the sites as columns,
or a flat layout: all products for site 1 as rows, followed by all products for site 2.

Expected output alternative 1

Expected output alternative 2

The structure of the output is:

[
  {
    "SiteId": "string",
    "Products": [
      {
        "ProductId": "string",
        "ProductNumber": "string"
      }
    ]
  }
]

Example

[{
    "SiteId": "0102",
    "Products": [{
        "ProductId": "12107708",
        "ProductNumber": "7070501"
      },
      {
        "ProductId": "15578",
        "ProductNumber": "26804"
      },
      {
        "ProductId": "15671",
        "ProductNumber": "600102"
      }
    ]
  }
]

Output from pprint(data) in Spyder is:

{'ProductId': '21831062', 'ProductNumber': '3364603'},
{'ProductId': '24432865', 'ProductNumber': '133101'},
{'ProductId': '24432978', 'ProductNumber': '1194515'},
{'ProductId': '1029420', 'ProductNumber': '198301'},
{'ProductId': '12282', 'ProductNumber': '408701'},
{'ProductId': '12946229', 'ProductNumber': '7174706'},
{'ProductId': '13278', 'ProductNumber': '42302'},
{'ProductId': '1028718', 'ProductNumber': '7536001'},
{'ProductId': '12945249', 'ProductNumber': '197404'},
{'ProductId': '16380', 'ProductNumber': '1133301'},
{'ProductId': '1866', 'ProductNumber': '257102'},
{'ProductId': '24420534', 'ProductNumber': '3422315'},
{'ProductId': '24424403', 'ProductNumber': '259301'},
{'ProductId': '10276', 'ProductNumber': '18004'},
{'ProductId': '1158212', 'ProductNumber': '689401'},
{'ProductId': '21775', 'ProductNumber': '395806'},
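
Since the printed output is too large to read in full, a short summary per site may be easier to check. The following is only a sketch: the sample bytes stand in for what response.read() returns, and the helper name summarize_sites is made up for illustration. It assumes the response has the SiteId/Products structure shown above.

```python
import json

# Stand-in for the bytes returned by response.read() (assumed sample, not real data)
raw = b'[{"SiteId": "0102", "Products": [{"ProductId": "15578", "ProductNumber": "26804"}]}]'


def summarize_sites(payload):
    """Return (SiteId, number of products) for each top-level entry."""
    parsed = json.loads(payload)  # json.loads accepts bytes as well as str
    return [(site['SiteId'], len(site['Products'])) for site in parsed]


print(summarize_sites(raw))
```

If this raises a KeyError for 'SiteId', the parsed response does not have the structure described above.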

Can you share the raw response that you're getting so we can see what is happening? – sevenr, Commented Apr 13, 2020 at 21:52
I am not sure I fully understand what you mean by raw response or how to show you this (not experienced with Python or Spyder). But if I add print(data) to my code, the response in Spyder is: – NewDev, Commented Apr 14, 2020 at 5:59
You can use the edit button below the question to make improvements to your question. Copy / paste the output from your print(data) there, and format it as code so we can help better. – Martin Evans, Commented Apr 14, 2020 at 10:45
For some reason I can't edit my comment anymore. The output is very large, so I'll give you a piece here: {"ProductId":"25077","ProductNumber":"694202"},{"ProductId":"369","ProductNumber":"44802"},{"ProductId":"358700","ProductNumber":"555201"},{"ProductId":"1084111","ProductNumber":"463801"} – NewDev, Commented Apr 14, 2020 at 12:04
Where does SiteId come from? It does not appear to be in your output of data. – Martin Evans, Commented Apr 14, 2020 at 13:08

Martin Evans · Accepted Answer · 2020-04-15 10:33:00Z

Your first output, with one column per site, can be produced using just the built-in csv module. To do this you first need to walk through the whole response to determine all the sites required for the header. While doing so, build a list of sites for each ProductId and ProductNumber combination:

from collections import defaultdict
import csv


# Sample data
json_data = [
    {"SiteId" : "0102", "Products" : [{"ProductId" : "12107708", "ProductNumber" : "7070501"}, {"ProductId" : "15578", "ProductNumber" : "26804"}, {"ProductId" : "15671", "ProductNumber" : "600102"}]}, 
    {"SiteId" : "0104", "Products" : [{"ProductId" : "12107708", "ProductNumber" : "7070501"}, {"ProductId" : "15579", "ProductNumber" : "26804"}, {"ProductId" : "15671", "ProductNumber" : "600102"}]}
]

# Map each (ProductId, ProductNumber) pair to the list of sites that have it
data = defaultdict(list)
sites = set()

for site in json_data:
    for product in site['Products']:
        site_text = f"site {site['SiteId']}"
        data[(product['ProductId'], product['ProductNumber'])].append(site_text)
        sites.add(site_text)

# One column per site, sorted so the header order is stable
fieldnames = ['ProductId', 'ProductNumber', *sorted(sites)]

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()

    for (product_id, number), product_sites in data.items():
        row = {'ProductId' : product_id, 'ProductNumber' : number}

        # Mark each site that has this product; DictWriter leaves the rest empty
        for site in product_sites:
            row[site] = 1

        csv_output.writerow(row)

So for the sample data given (with two sites), output.csv would contain:

ProductId,ProductNumber,site 0102,site 0104
12107708,7070501,1,1
15578,26804,1,
15671,600102,1,1
15579,26804,,1
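
The second layout from the question (all products for one site as rows, then the next site) does not need the grouping step. A minimal sketch under the same assumptions about the data: the helper names flatten_sites and write_flat_csv are made up for illustration, and the ';' delimiter is chosen only to match the sep=';' used in the question.

```python
import csv
import io

# Sample data in the same shape as the structure shown in the question
json_data = [
    {"SiteId": "0102", "Products": [
        {"ProductId": "12107708", "ProductNumber": "7070501"},
        {"ProductId": "15578", "ProductNumber": "26804"},
    ]},
    {"SiteId": "0104", "Products": [
        {"ProductId": "12107708", "ProductNumber": "7070501"},
    ]},
]

FIELDNAMES = ['SiteId', 'ProductId', 'ProductNumber']


def flatten_sites(sites):
    """Yield one flat dict per (site, product) pair, in input order."""
    for site in sites:
        for product in site['Products']:
            yield {
                'SiteId': site['SiteId'],
                'ProductId': product['ProductId'],
                'ProductNumber': product['ProductNumber'],
            }


def write_flat_csv(sites, f_output):
    """Write the flattened rows to an already open text file object."""
    csv_output = csv.DictWriter(f_output, fieldnames=FIELDNAMES, delimiter=';')
    csv_output.writeheader()
    csv_output.writerows(flatten_sites(sites))


# Written to an in-memory buffer here; a file opened with
# open('some_name.csv', 'w', newline='') can be passed in the same way.
buffer = io.StringIO()
write_flat_csv(json_data, buffer)
print(buffer.getvalue())
```

Because every value stays a string, site ids such as 0102 keep their leading zero in the file.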


2 Comments

NewDev Over a year ago

Great! Will try it later today and get back to you!

NewDev Over a year ago

Thanks a lot! Works perfect!!
