I am trying to parse data from an api with python and requests.

SO reference: Python codecs and utf-8 BOM error

I have listed the reference above because I updated the script after each error I received.

import requests
import codecs
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
data = json.load(codecs.open(r.json(), 'utf-8-sig'))
# reads = r.json()
# data = reads.decode('utf-8-sig')

with open('data.json', 'w') as f:
    json.dump(data, f)

I want to save the response from the API at https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/ to a JSON file.

Initially I received the error below, so I applied the codecs fix from the SO reference answer.

json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

This is the fix from the SO answer:

data = json.load(codecs.open(r.json(), 'utf-8-sig'))

Now I receive this error:

TypeError: expected str, bytes or os.PathLike object, not dict

However, I cannot resolve the TypeError because I need to load using codecs to avoid the utf-8-sig error.

How can I parse and write from requests and avoid both errors?

EDIT

Updated using the answer below; however, it fails to write the file to disk.

import requests
import codecs
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
data = json.load(codecs.open(r.text, 'r', 'utf-8-sig'))

with open('data.json', 'w') as f:
    f.write(data)

Answer

import requests
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")

with open('data.json', 'w') as output:
    output.write(r.text)
    Is there a reason you need to use r.json()? Why not just write r.text straight to a file? Commented Apr 6, 2017 at 1:42

2 Answers


codecs.open opens a local file using a given encoding. codecs.decode converts an in-memory object. So I think you're after:

data = json.loads(codecs.decode(r.content, 'utf-8-sig'))

Note that I've used r.content (the raw bytes) rather than r.json(), which means the requests library will not attempt to do any parsing of its own. json.loads is needed here because the decoded result is a string, not a file object. Unless you want to modify the data before saving, though, you could just save the response directly to disk:

with open('data.json', 'w') as f:
    f.write(r.text)
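
To see the in-memory decoding without touching the network, here is a minimal sketch. The `sample` payload and the `parse_bom_json` helper are hypothetical stand-ins; with requests, the bytes would presumably come from `r.content`.

```python
import codecs
import json


def parse_bom_json(raw: bytes):
    """Decode UTF-8 bytes, dropping a leading BOM if present, then parse JSON."""
    # 'utf-8-sig' strips the BOM; input without a BOM decodes the same way.
    text = raw.decode("utf-8-sig")
    return json.loads(text)


# Hypothetical payload standing in for r.content: a BOM followed by JSON.
sample = codecs.BOM_UTF8 + b'{"meeting": "mr", "races": [1, 2, 3]}'
parsed = parse_bom_json(sample)

# Decoding the same bytes as plain UTF-8 keeps the BOM character,
# which json.loads refuses to parse.
try:
    json.loads(sample.decode("utf-8"))
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True
```

The helper works on bytes only, which matches the observation that the decode step expects binary input rather than already-decoded text.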

2 Comments

updated, needed to add 'r' in codecs call. However it still fails to write the json file instead printing it out in the console.
instead of r.text, I needed to use r.content because codecs.decode was expecting binary input. Otherwise it worked for me! Thanks @alex-taylor.

To answer your updated question: your code never reached the part that writes data to the file. If you scroll up in your output, I believe the error you got is:

IOError: [Errno 63] File name too long:...

The first parameter of codecs.open(r.text, 'r', 'utf-8-sig') is a file name, as the docs for codecs.open explain. I think Alex Taylor's answer is enough to write the response to a file, but if you really need to decode the response, you could try:

data = codecs.decode(r.content, 'utf-8-sig')
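
For contrast, this is roughly what codecs.open is meant for: reading a file that already exists on disk. The sketch below assumes a temporary file with made-up contents; nothing in it comes from the real API.

```python
import codecs
import json
import os
import tempfile

# Write a small BOM-prefixed JSON file to a temporary location (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b'{"status": "ok"}')

# codecs.open takes a file name as its first argument;
# the 'utf-8-sig' encoding drops the BOM while reading.
with codecs.open(path, "r", "utf-8-sig") as f:
    from_file = json.load(f)

# Clean up the temporary file.
os.remove(path)
```

Passing the response body itself as the first argument makes Python treat the whole body as a path, which is why the call fails before anything is parsed.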

Another error in your code: data = json.load(codecs.open(r.text, 'r', 'utf-8-sig')) would make data a parsed Python object rather than a string, and f.write cannot write such an object to a file. You can dump it to your file with json.dump instead:

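import requests
import json
import codecs

r = requests.get("https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
# Decode the raw bytes with 'utf-8-sig' to drop the BOM, then parse the JSON text.
data = json.loads(codecs.decode(r.content, 'utf-8-sig'))

# Serialize the parsed object back to JSON on disk.
with open('data.json', 'w') as f:
    json.dump(data, f)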

And you can load it back later with code:

with open('data.json', 'r') as f:
    data = json.load(f)
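
As a quick check that dumping and loading round-trip cleanly, here is a small sketch that uses a temporary directory and a made-up dictionary in place of the real API response.

```python
import json
import os
import tempfile

# Hypothetical parsed response; the real data would come from the API.
data = {"meeting": "mr", "races": [{"number": 1, "name": "Maiden"}]}

# Use a fresh temporary directory so no existing data.json is overwritten.
path = os.path.join(tempfile.mkdtemp(), "data.json")

# Serialize the Python object to JSON text on disk.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f)

# Parse the file back into a Python object.
with open(path, "r", encoding="utf-8") as f:
    restored = json.load(f)
```

If `restored` comes back as a string rather than a dict, the object handed to json.dump was probably raw text that was never parsed.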
